lttng-tools.git
4 years agotrigger: generate and add tracer token on registration
Jonathan Rajotte [Wed, 9 Sep 2020 21:16:53 +0000 (17:16 -0400)] 
trigger: generate and add tracer token on registration

Assign a unique tracer token to a trigger.

This token will be used as the unique id that will be communicated back
to the sessiond by the tracers for tracer notification.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2033dcaa4c5536b29dd4d7c57933e1aa686082cd

4 years agoaction-executor: add trigger name to debugging output
Jonathan Rajotte [Thu, 24 Sep 2020 19:14:47 +0000 (15:14 -0400)] 
action-executor: add trigger name to debugging output

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I73f8fda4b7fee331700988ea73471e3cb1516ed6

4 years agotrigger: implement trigger naming
Jonathan Rajotte [Mon, 23 Mar 2020 22:27:59 +0000 (18:27 -0400)] 
trigger: implement trigger naming

A trigger can now have an optional name on the client side.

If no name is provided the sessiond will generate a name and return a
trigger object to populate the client side object.

For now, the name generation code generate the following pattern: TN

Where `N` is incremented each time a name has to be generated. If a
collision occurs, we increment `N` as needed.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5f303610713c049177e53937bfc9824cd61501e4

4 years agoport: run namespace tests only on Linux
Michael Jeanson [Tue, 13 Oct 2020 23:19:10 +0000 (19:19 -0400)] 
port: run namespace tests only on Linux

Change-Id: I574d6e7419715e191fb9102e4cfc916ea0e529aa
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
4 years agoport: FreeBSD does support fchown and fchmod on a shm fd
Michael Jeanson [Wed, 4 Nov 2020 15:04:12 +0000 (10:04 -0500)] 
port: FreeBSD does support fchown and fchmod on a shm fd

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iadff886d593ae3f77a4e96dfbfe02d1c1ea45f1e

4 years agoport: Add pthread_setname_np FreeBSD compat
Michael Jeanson [Thu, 29 Oct 2020 10:09:41 +0000 (06:09 -0400)] 
port: Add pthread_setname_np FreeBSD compat

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7ca8334c4ce28bc240c898aeb5a6857ff951143b

4 years agoport: only enable userspace callstack context on Linux
Michael Jeanson [Wed, 14 Oct 2020 14:32:14 +0000 (10:32 -0400)] 
port: only enable userspace callstack context on Linux

Change-Id: I55402a7058f7d0bbe11d4c59197197130fe88665
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
4 years agotrigger: implement is_equal
Jonathan Rajotte [Tue, 24 Mar 2020 15:32:08 +0000 (11:32 -0400)] 
trigger: implement is_equal

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I646c13e7fb26fda66b888ce90253e87567b2cab8

4 years agotrigger: expose trigger owner uid
Jonathan Rajotte [Fri, 18 Sep 2020 20:37:50 +0000 (16:37 -0400)] 
trigger: expose trigger owner uid

To facilitate behavior management for the root user and to allow
duplicate trigger names across users, enforce the usage of the trigger
owner user id.

The root user will be able to register and unregister triggers on behalf
of other users. The root user will also have visibility on triggers of
other users.

Only the root user can use the `lttng_trigger_set_owner_uid` function
successfully. As indicated in the comments, this function performs
a client-side validation steps to catch mis-uses, but this is
properly enforced on the sessiond's end in the register/unregister
trigger commands.

With the future addition of a trigger name (id), the owner id and the
name will act as a key tuple allowing identicaly named triggers across
users.

We plan on exposing the `--user` switch in the upcoming command line
(add-trigger, remove-trigger).

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifca3c41b7ffd97b67e16fb80c18472b667cb2f56

4 years agoClean-up: action-executor: typo and missing tab
Jonathan Rajotte [Fri, 22 May 2020 15:27:37 +0000 (11:27 -0400)] 
Clean-up: action-executor: typo and missing tab

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5646fae2db98a3c8ffa0ba92aff9815f4ce0cf53

4 years agoTests: Fix: 99% fill ratio for high buffer usage is too high for larger events
Jonathan Rajotte [Thu, 28 May 2020 01:29:05 +0000 (21:29 -0400)] 
Tests: Fix: 99% fill ratio for high buffer usage is too high for larger events

If the event being registered is bigger than 1% of a subbuffer, the 99%
ratio cannot be achieved since the "last event" necessary to go over 99%
will always be dropped by the tracer.

e.g:
  DBG1 - 19:31:07.665963875 [Notification]: [notification-thread] High buffer usage condition being evaluated: threshold = 16220, highest usage = 16196 (in evaluate_buffer_usage_condition() at notification-thread-events.c:3733)

We use a ratio of 90% to keep a little headroom.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I06180735e0b5e88209b888e51cc83b4ac7d98193

4 years agoFix: action: invalid header offset used when serializing snapshot action
Jonathan Rajotte [Wed, 8 Jul 2020 02:51:27 +0000 (22:51 -0400)] 
Fix: action: invalid header offset used when serializing snapshot action

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I77f5fab214f6721773147968ea3b85dddfea8d62

4 years agoport: FreeBSD has no ENODATA, alias it to ENOATTR
Michael Jeanson [Tue, 13 Oct 2020 21:33:44 +0000 (17:33 -0400)] 
port: FreeBSD has no ENODATA, alias it to ENOATTR

According to 'the internet' ENOATTR is used in a similar fashion to
ENODATA on the BSDs and we used it internally only anyway.

Change-Id: Ia4e77fd6d28c9dfb43f99ddba6c32369384827f0
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
4 years agoport: tests: /proc/self/fd is Linux only, use /dev/fd on other Unices
Michael Jeanson [Tue, 20 Oct 2020 19:02:45 +0000 (15:02 -0400)] 
port: tests: /proc/self/fd is Linux only, use /dev/fd on other Unices

Change-Id: I2be8120c7dce3f12daaf12a190810a145afa50b6
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
4 years agoCleanup: Use pkg-config to detect liburcu
Michael Jeanson [Fri, 30 Oct 2020 06:48:08 +0000 (02:48 -0400)] 
Cleanup: Use pkg-config to detect liburcu

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I88d3f853c8ee0e14a38a462ce24626800e0a4caf

4 years agoClean-up: sessiond: silence negative index warning
Jérémie Galarneau [Thu, 29 Oct 2020 15:43:35 +0000 (11:43 -0400)] 
Clean-up: sessiond: silence negative index warning

Coverity warns that `lttng_action_get_type()` can return
a negative index (LTTNG_ACTION_TYPE_UNKNOWN). This scenario
is not reachable, but a check is added to silence the analyzer.

Original report:
  1435955 Negative array index read

  A memory location at a negative offset from the beginning of the array
  will be read, resulting in incorrect values.

  In get_action_name: Negative value used to index an array in a read
  operation (CWE-129)

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5952096a1d29f0d4a3c4350a2a842874d5f3973b

4 years agocredentials: uid and gid now use LTTNG_OPTIONAL
Jonathan Rajotte [Fri, 25 Sep 2020 20:35:28 +0000 (16:35 -0400)] 
credentials: uid and gid now use LTTNG_OPTIONAL

The triggers will only use the uid element.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia96e7def5ab560d9af1476920426635fc49f92ef

4 years agoport: Add missing sock_cred macros on FreeBSD
Michael Jeanson [Tue, 13 Oct 2020 23:06:09 +0000 (19:06 -0400)] 
port: Add missing sock_cred macros on FreeBSD

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I71f51ef61bf659c758edba6fd27faeef56654acf

4 years agoport: use compat lttng_fls()
Michael Jeanson [Tue, 13 Oct 2020 22:55:23 +0000 (18:55 -0400)] 
port: use compat lttng_fls()

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I698b31a24c5b442a00fe570a0ac53e23bb817bec

4 years agoport: FreeBSD has no LOGIN_NAME_MAX, use sysconf instead
Michael Jeanson [Tue, 13 Oct 2020 22:44:40 +0000 (18:44 -0400)] 
port: FreeBSD has no LOGIN_NAME_MAX, use sysconf instead

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id058e15608ce0332500343ce389365a6fb1a40cc

4 years agoport: no eventfd support on FreeBSD
Michael Jeanson [Tue, 13 Oct 2020 22:44:19 +0000 (18:44 -0400)] 
port: no eventfd support on FreeBSD

It's only used in the tests to create dummy fds, use fcntl to duplicate
the stdout fd instead.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I401f2bfe6a2375a9bf4d895956071f74e5684783

4 years agooptional: Add LTTNG_OPTIONAL_INIT_VALUE
Jérémie Galarneau [Fri, 2 Oct 2020 21:25:11 +0000 (17:25 -0400)] 
optional: Add LTTNG_OPTIONAL_INIT_VALUE

Add helper to initialize an optional field to a 'set' value.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I439302ebec2433abcf7edb6167bf5b02db5a9a55

4 years agoaction: Mark parameter of lttng_action_get_type as const
Jonathan Rajotte [Wed, 23 Sep 2020 18:34:59 +0000 (14:34 -0400)] 
action: Mark parameter of lttng_action_get_type as const

Remove lttng_action_get_type_const as it is no longer needed.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1525bc2c89eb37ab3e75d915c6ff50bd2a7f5d21

4 years agoIntroduce lttng_domain_type_str utility
Jonathan Rajotte [Tue, 29 Sep 2020 15:46:24 +0000 (11:46 -0400)] 
Introduce lttng_domain_type_str utility

Change-Id: I1d2c7be968da6658e93407cdba26a6042177badd
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
4 years agoport: no HOST_NAME_MAX on FreeBSD, use LTTNG_HOST_NAME_MAX
Michael Jeanson [Tue, 13 Oct 2020 21:54:27 +0000 (17:54 -0400)] 
port: no HOST_NAME_MAX on FreeBSD, use LTTNG_HOST_NAME_MAX

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I83cc40a123539a668c25828144905b628df9fdef

4 years agoport: ELF_ST_TYPE is defined in elf.h on FreeBSD
Michael Jeanson [Tue, 13 Oct 2020 21:54:04 +0000 (17:54 -0400)] 
port: ELF_ST_TYPE is defined in elf.h on FreeBSD

No need to alias ELF32_ST_TYPE to ELF_ST_TYPE.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8afa2fb9d96b81d994b90c8291f2f457a037a525

4 years agoport: posix_fadvise is available in FreeBSD >= 10.0
Michael Jeanson [Tue, 13 Oct 2020 21:32:14 +0000 (17:32 -0400)] 
port: posix_fadvise is available in FreeBSD >= 10.0

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I85f823ad7be94a5860ce0104c20e5a49ce030eda

4 years agoport: fix compat/endian.h on FreeBSD
Michael Jeanson [Tue, 13 Oct 2020 21:32:00 +0000 (17:32 -0400)] 
port: fix compat/endian.h on FreeBSD

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If591ed8d1cf50c1914a613976e9e285c3647906c

4 years agoport: ls --ignore= is a GNU extension
Michael Jeanson [Wed, 14 Oct 2020 18:32:37 +0000 (14:32 -0400)] 
port: ls --ignore= is a GNU extension

Use grep -v instead to filter README.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8fb6aba97ba1484aff511d59eeb4584e9672659e

4 years agoTests: poll: test all possible combinations of active fds in a poll set
Jérémie Galarneau [Tue, 27 Oct 2020 21:23:45 +0000 (17:23 -0400)] 
Tests: poll: test all possible combinations of active fds in a poll set

The poll compatibility layer used on all non-Linux platforms would
hang for certain combinations of active file descriptors reported
by poll.

A new test is introduced to try all combinations of active file
descriptors for a given number of file descriptors in a poll set.

The unit test tries all combinations of 8 file descriptors which
exercises all the current compatibility code and ensures the
test concludes rapidly.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie479c4f2d85917713d3f2bdc1e4f0423ca9243af

4 years agoFix: common: poll: compat_poll_wait never finishes
Jérémie Galarneau [Fri, 16 Oct 2020 18:43:39 +0000 (14:43 -0400)] 
Fix: common: poll: compat_poll_wait never finishes

compat_poll_wait hangs when poll returns an array of file
descriptors of the form:
  [ Inactive Active ]

The logic to find the first idle pollfd entry is bogus and actually
skips the first idle entry. This causes the follow-up loop to never
conclude.

The pollfd array defragmentation logic is re-written in a simpler
style to handle those cases appropriately.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8669a870df1ec1160f05e35e83671917bb80d6f9

4 years agoTests: Add syscall enable/disable scenarios
Michael Jeanson [Thu, 3 Sep 2020 15:04:46 +0000 (11:04 -0400)] 
Tests: Add syscall enable/disable scenarios

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic3d9e739b01a0cf2bffb7c103911b3b51520010e

4 years agoCleanup: simplify 'poll' wrapper build
Michael Jeanson [Wed, 26 Aug 2020 15:39:15 +0000 (11:39 -0400)] 
Cleanup: simplify 'poll' wrapper build

Remove the AM conditionnal and merge the sources in single files like the
other wrappers. This removes a special case from the build system.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6078f6013e52c3bc7c74cb8937f3741453c65874

4 years agoCleanup: autoconf 'dirfd' detection
Michael Jeanson [Wed, 26 Aug 2020 15:17:10 +0000 (11:17 -0400)] 
Cleanup: autoconf 'dirfd' detection

Remove the unused AM conditionnal and use the 'HAVE_' prefix for the
define like the other detected features.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9a001051a14e2360e7f66fd4f627f97b11563c4f

4 years agoSet version to 2.13-pre
Michael Jeanson [Tue, 6 Oct 2020 14:24:56 +0000 (10:24 -0400)] 
Set version to 2.13-pre

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ib63daa19b91c4cd94caf4fd6cbdfd6fd1e8f015b

4 years agorelayd: silence null dereference warning during viewer stream creation
Jérémie Galarneau [Fri, 16 Oct 2020 12:25:10 +0000 (08:25 -0400)] 
relayd: silence null dereference warning during viewer stream creation

Coverity warns that the vstream's trace chunk may be used NULL.
However, this won't happen if the corresponding relay stream has
an active trace chunk.

Coverity report:
  1433620 Dereference after null check

  Either the check against null is unnecessary, or there may be a
  null pointer dereference.

  In viewer_stream_create: Pointer is checked against null but then
  dereferenced anyway (CWE-476)

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie032ed415a99cfff149e3325d05f37ededb52d33

4 years agoFix: relayd: failure to read index entry or stream packet after clear
Jérémie Galarneau [Wed, 7 Oct 2020 18:10:35 +0000 (14:10 -0400)] 
Fix: relayd: failure to read index entry or stream packet after clear

Observed issue
==============

The clear tests occasionally fail with the following babeltrace error
when a live session is stopped following a "clear". Unfortunately, this
problem only seems to occur on certain machines. In my case, I only
managed to reproduce this on the CI's workers.

  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE/VIEWER lttng_live_get_stream_bytes@viewer-connection.c:1610 [lttng-live] Received get_data_packet response: error
  10-07 12:39:48.333  7679  7679 E PLUGIN/CTF/MSG-ITER request_medium_bytes@msg-iter.c:563 [lttng-live] User function failed: status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/CTF/MSG-ITER ctf_msg_iter_get_next_message@msg-iter.c:2899 [lttng-live] Cannot handle state: msg-it-addr=0x5603c28e2830, state=DSCOPE_TRACE_PACKET_HEADER_BEGIN
  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE lttng_live_iterator_next_handle_one_active_data_stream@lttng-live.c:845 [lttng-live] CTF message iterator failed to get next message: msg-iter=0x5603c28e2830, msg-iter-status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE lttng_live_msg_iter_next@lttng-live.c:1665 [lttng-live] Error preparing the next batch of messages: live-iter-status=LTTNG_LIVE_ITERATOR_STATUS_ERROR
  10-07 12:39:48.333  7679  7679 W LIB/MSG-ITER bt_message_iterator_next@iterator.c:864 Component input port message iterator's "next" method failed: iter-addr=0x5603c28cb0f0, iter-upstream-comp-name="lttng-live", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE, iter-upstream-comp-class-name="lttng-live", iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/FLT.UTILS.MUXER muxer_upstream_msg_iter_next@muxer.c:454 [muxer] Upstream iterator's next method returned an error: status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/FLT.UTILS.MUXER validate_muxer_upstream_msg_iters@muxer.c:991 [muxer] Cannot validate muxer's upstream message iterator wrapper: muxer-msg-iter-addr=0x5603c28dbe70, muxer-upstream-msg-iter-wrap-addr=0x5603c28cd0f0
  10-07 12:39:48.333  7679  7679 E PLUGIN/FLT.UTILS.MUXER muxer_msg_iter_next@muxer.c:1415 [muxer] Cannot get next message: comp-addr=0x5603c28dc960, muxer-comp-addr=0x5603c28db0a0, muxer-msg-iter-addr=0x5603c28dbe70, msg-iter-addr=0x5603c28caf80, status=ERROR
  10-07 12:39:48.333  7679  7679 W LIB/MSG-ITER bt_message_iterator_next@iterator.c:864 Component input port message iterator's "next" method failed: iter-addr=0x5603c28caf80, iter-upstream-comp-name="muxer", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER, iter-upstream-comp-class-name="muxer", iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  10-07 12:39:48.333  7679  7679 W LIB/GRAPH consume_graph_sink@graph.c:473 Component's "consume" method failed: status=ERROR, comp-addr=0x5603c28dcb60, comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK, comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages (`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0x5603c28c8140, comp-class-so-handle-path="/home/jenkins/jgalar-debug/build/usr/lib/babeltrace2/plugins/babeltrace-plugin-text.so", comp-input-port-count=1, comp-output-port-count=0
  10-07 12:39:48.333  7679  7679 E CLI cmd_run@babeltrace2.c:2548 Graph failed to complete successfully
  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE/VIEWER lttng_live_session_detach@viewer-connection.c:1227 [lttng-live] Unknown detach return code 0

  ERROR:    [Babeltrace CLI] (babeltrace2.c:2548)
    Graph failed to complete successfully
  CAUSED BY [libbabeltrace2] (graph.c:473)
    Component's "consume" method failed: status=ERROR, comp-addr=0x5603c28dcb60,
    comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK,
    comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages
    (`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0x5603c28c8140,
    comp-class-so-handle-path="/home/jenkins/jgalar-debug/build/usr/lib/babeltrace2/plugins/babeltrace-plugin-text.so",
    comp-input-port-count=1, comp-output-port-count=0
  CAUSED BY [libbabeltrace2] (iterator.c:864)
    Component input port message iterator's "next" method failed:
    iter-addr=0x5603c28caf80, iter-upstream-comp-name="muxer",
    iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER,
    iter-upstream-comp-class-name="muxer",
    iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu",
    iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  CAUSED BY [muxer: 'filter.utils.muxer'] (muxer.c:991)
    Cannot validate muxer's upstream message iterator wrapper:
    muxer-msg-iter-addr=0x5603c28dbe70,
    muxer-upstream-msg-iter-wrap-addr=0x5603c28cd0f0
  CAUSED BY [muxer: 'filter.utils.muxer'] (muxer.c:454)
    Upstream iterator's next method returned an error: status=ERROR
  CAUSED BY [libbabeltrace2] (iterator.c:864)
    Component input port message iterator's "next" method failed:
    iter-addr=0x5603c28cb0f0, iter-upstream-comp-name="lttng-live",
    iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE,
    iter-upstream-comp-class-name="lttng-live",
    iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon",
    iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (lttng-live.c:1665)
    Error preparing the next batch of messages:
    live-iter-status=LTTNG_LIVE_ITERATOR_STATUS_ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (lttng-live.c:845)
    CTF message iterator failed to get next message: msg-iter=0x5603c28e2830,
    msg-iter-status=ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (msg-iter.c:2899)
    Cannot handle state: msg-it-addr=0x5603c28e2830,
    state=DSCOPE_TRACE_PACKET_HEADER_BEGIN
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (msg-iter.c:563)
    User function failed: status=ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (viewer-connection.c:1610)
    Received get_data_packet response: error

This occurs immediately following a 'stop' on the session. As the error
indicates, a request to obtain a data packet fails with a generic
error reply.

Moreover, the following LTTNG_VIEWER_DETACH_SESSION appears to fail
with an invalid status code. This is addressed in a different commit.

Reproducing the test's failure without redirecting the relay daemon's
allows us to see the following errors after the first stop:
  PERROR - 14:33:44.929675253 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.030037417 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.130429370 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.230829447 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.331223320 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)

This is produced with the following back-trace:
  (gdb) bt
  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1  0x00007ffff69648b1 in __GI_abort () at abort.c:79
  #2  0x00005555555b4f1f in fd_tracker_open_fs_handle (tracker=0x55555582c620, directory=0x7fffe8006680,
      path=0x7ffff0a25870 "ust/uid/1001/64-bit/index/chan_1.idx", flags=0, mode=0x7ffff0a24508) at fd-tracker.c:550
  #3  0x0000555555595c34 in _lttng_trace_chunk_open_fs_handle_locked (chunk=0x7fffe0002130, file_path=0x7ffff0a25870 "ust/uid/1001/64-bit/index/chan_1.idx",
      flags=0, mode=432, out_handle=0x7ffff0a24710, expect_no_file=true) at trace-chunk.c:1388
  #4  0x0000555555595eef in lttng_trace_chunk_open_fs_handle (chunk=0x7fffe0002130, file_path=0x7ffff0a25870 "ust/uid/1001/64-bit/index/chan_1.idx", flags=0,
      mode=432, out_handle=0x7ffff0a24710, expect_no_file=true) at trace-chunk.c:1433
  #5  0x00005555555da6c2 in _lttng_index_file_create_from_trace_chunk (chunk=0x7fffe0002130, channel_path=0x7fffe8018c30 "ust/uid/1001/64-bit",
      stream_name=0x7fffe8018c10 "chan_1", stream_file_size=0, stream_file_index=0, index_major=1, index_minor=1, unlink_existing_file=false, flags=0,
      expect_no_file=true, file=0x7fffe0002270) at index.c:97
  #6  0x00005555555dad8a in lttng_index_file_create_from_trace_chunk_read_only (chunk=0x7fffe0002130, channel_path=0x7fffe8018c30 "ust/uid/1001/64-bit",
      stream_name=0x7fffe8018c10 "chan_1", stream_file_size=0, stream_file_index=0, index_major=1, index_minor=1, expect_no_file=true, file=0x7fffe0002270)
      at index.c:186
  #7  0x000055555557640f in try_open_index (vstream=0x7fffe0002250, rstream=0x7fffe8018c50) at live.c:1378
  #8  0x0000555555577155 in viewer_get_next_index (conn=0x7fffd4001440) at live.c:1643
  #9  0x0000555555579a01 in process_control (recv_hdr=0x7ffff0a27c30, conn=0x7fffd4001440) at live.c:2311
  #10 0x000055555557a1db in thread_worker (data=0x0) at live.c:2482
  #11 0x00007ffff6d1c6db in start_thread (arg=0x7ffff0a28700) at pthread_create.c:463
  #12 0x00007ffff6a45a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

That problem is mostly cosmetic in nature (the open can fail
"legitimately") as the PERROR should simply not be printed and is
addressed in a different commit.

This error is also produced after a 'clear' is issued:
  PERROR - 14:33:45.532782268 [25108/25115]: Failed to read from file system handle of viewer stream id 1, offset: 4096: No such file or directory (in viewer_get_packet() at live.c:1849)

Which is produced with the following back-trace:
  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1  0x00007f53e297c8b1 in __GI_abort () at abort.c:79
  #2  0x000055dd77ccef2c in viewer_get_packet (conn=0x7f53c4001100) at live.c:1850
  #3  0x000055dd77cd0a15 in process_control (recv_hdr=0x7f53dca3fc30, conn=0x7f53c4001100) at live.c:2315
  #4  0x000055dd77cd11db in thread_worker (data=0x0) at live.c:2483
  #5  0x00007f53e2d346db in start_thread (arg=0x7f53dca40700) at pthread_create.c:463
  #6  0x00007f53e2a5da3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

A similar problem occurs, although more rarely, when reading an
index entry in viewer_get_next_index().

Cause
=====

The following situation leads to both failures to get a
packet and failures to get the next index:
  - Viewer connects to an existing session,
  - Viewer consumes a number of packets, alternating the
    GET_NEXT_INDEX and GET_PACKET command,
  - The session's streams are rotated to a new trace chunk
    (as part of a clear),
  - The session is started and stopped, causing new packets
    to be produced and received,
  - The session is stopped and destroyed, causing the session's
    streams to rotate into a "null" trace chunk (no active
    trace files),
  - Viewer issues GET_NEXT_INDEX or GET_PACKET, but the fact
    that a rotation occurred on the receiving end is not detected
    as the relay streams' trace chunk are "null".

The crux of the problem is that lttng_trace_chunk_ids_equal() is
bypassed when the current trace chunk of a relay stream is "null".

The rationale for skipping this check is that it is assumed that the
files currently opened by the live server can can still be used even
if the consumer has rotated the corresponding streams into a 'null'
trace chunk, meaning no trace chunk is 'set' for those streams.

This makes sense in one scenario: the session was destroyed and we wish
to allow a connected live client to finish consuming the trace packets
up to the end of the session's lifetime.

Here, the situation is different. The viewer is reading chunk 'A'.
Meanwhile, a rotation occurs into chunk 'B' and packets are received for
chunk 'B'. Then, a rotation to a 'null' chunk (no active chunk) occurs.

In essence, the live server never sees the rotation between chunk 'A'
and 'B', and simply assumes that a rotation from 'A' to 'null' occurred,
as would happen at the end of a session.

In terms of the code, in viewer_get_next_index(), a call to
check_index_status() is performed to determine if an index is available.
The function checks that `index_received_seqcount` is greater than
`index_sent_seqcount`. In that case, it determines that an index must be
available.

Unfortunately, there is no way for the live server to determine that the
remaining indexes are in a chunk that doesn't exist anymore (chunk 'B').
Thus, viewer_get_next_index() attempts to read an index entry from the
current index file and fails.

Solution
========

1) lttng_trace_chunk_ids_equal() is modified to properly handle
'null' trace chunks:
  - A null and a non-null trace chunk are not equal,
  - Two null trace chunks are equal.

2) Rotation count
  A rotation counter is introduced to track the number of rotations
  that occurred during a relay stream's lifetime. This counter is
  sampled by the matching viewer streams on creation and on rotation
  and is used to determine if all rotations were "seen" by the viewer
  stream.

  Hence, this allows us to handle the special case where a viewer
  is consuming the contents of a relay stream that just transitioned
  into a 'null' trace chunk (see comments in patch).

The rest of the modifications simply allow the live server to handle
null trace chunks in viewer streams. This fixes another unrelated bug
that I observed while investigating this: sessions that don't have an
active trace chunk are not shown when listing sessions with babeltrace.

To reproduce, simply stop, clear a session, and attempt to list the
sessions of the associated relay daemon.

Known drawbacks
===============

None.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ibb3116990e34b7ec3b477f3482d0c0ff1e848d09

4 years agoFix: lttng-ctl: erroneous uses of LTTNG_PACKED
Jérémie Galarneau [Tue, 13 Oct 2020 18:55:33 +0000 (14:55 -0400)] 
Fix: lttng-ctl: erroneous uses of LTTNG_PACKED

The LTTNG_PACKED macro uses gcc attributes to indicate that a structure
should be packed. Hence, this macro obeys the same rules as the gcc
attribute.

Various mis-uses of the LTTNG_PACKED macros may result in structure not
being packed:
  - The LTTNG_PACKED macro should always be placed _before_ an identifier
    when a structure is declared in-place.
  - Adding LTTNG_PACKED at the definition site has no effect if the
    structure was declared elsewhere.

Those mis-uses cause issues when mixing the bitness (32/64) of the
session daemon and liblttng-ctl.

Outstanding issues include the following structures that are not
tagged as LTTNG_PACKED:
  - struct lttng_event
  - struct lttng_channel
  - struct lttng_event_context

Unfortunately, those structures are exposed by the public API and
can't be tagged as being "packed". Doing so would break the ABI
of liblttng-ctl.

These structures should be packed/unpacked explicitly.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I095dc0dffc6bf9e15dc7a7ec797958a5780ef150

4 years agoFix: relayd: live: invalid return code on DETACH_SESSION
Jérémie Galarneau [Fri, 9 Oct 2020 16:04:10 +0000 (12:04 -0400)] 
Fix: relayd: live: invalid return code on DETACH_SESSION

Babeltrace 2 reports an invalid return code being returned in reply to a
DETACH_SESSION command.

Reviewing the relevant Babeltrace 2 code, the logging can only be
produced if the reception of the lttng_viewer_detach_session_response
structure succeeds.

This elemininated my first guess that this was caused by the relay
daemon closing the socket before sending the reply. In that case, an
invalid status code of '0' could have been erroneously returned as a
status code since the recv() call on the socket would return 0.

It turns out that on a failure to return a packet, viewer_get_packet()
returns an error status code, but also sends a zero-initialized payload
buffer of the size of the requested packet.

This causes live clients which detach following the error of the
GET_PACKET command to interpret the still-enqueued zero-initialized
buffer as a reply to the DETACH_SESSION command. Since zero is not a
valid status code, it is correctly interpreted as a protocol error.

The reply_size is set to the header's size to only transmit the header
when an error reply is sent.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I69ed74f83404a16353d2bdbaa9f3adcdc2a03892

4 years agoTests: clear: remove test workspace directory
Jérémie Galarneau [Thu, 8 Oct 2020 22:15:34 +0000 (18:15 -0400)] 
Tests: clear: remove test workspace directory

The clear tests only removes its workspace's subdirectory, but
leaves an empty directory behind. Remove the wildcard and remove
the root of the workspace on clean-up.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I551f892af5423c6ed5933beb0c1a13f41a30a26e

4 years agoTests: ns_contexts: discarded events result in test failure
Jérémie Galarneau [Mon, 21 Sep 2020 21:24:50 +0000 (17:24 -0400)] 
Tests: ns_contexts: discarded events result in test failure

A follow-up change makes all events emited by gen-ust-events
a bit larger, which causes them to no longer fit in the default
channel configuration's buffers.

This causes the test to fail occasionnaly when the consumer daemon
fails to consume the packets fast enough to leave room in the
buffers for new events.

The test doesn't need to produce 10,000 events; reducing to
1,000 produced events makes no material difference and works
around the problem.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie87583bb9bb9cdd813f80443231a65164ef67df1

4 years agoFix: PERROR spam when `tracing` group does not exist
Jérémie Galarneau [Tue, 15 Sep 2020 20:22:14 +0000 (16:22 -0400)] 
Fix: PERROR spam when `tracing` group does not exist

The session daemon prints a PERROR on launch when the tracing group does
not exist. This should not occur when the group simply does not exist as
this is not an error. In that case (ESRCH), a DBG statement is
sufficient.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3ade29071a8f4e9fe2eb56bf05ff4150b70fd463

4 years agoBuild fix: implicit declaration of function 'PERROR' on Solaris
Jérémie Galarneau [Fri, 11 Sep 2020 03:57:54 +0000 (23:57 -0400)] 
Build fix: implicit declaration of function 'PERROR' on Solaris

Solaris 10/11 CI builds complain that PERROR is not declared
in memstream.h. This addresses the warning on both platforms.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I09ce7cea216e075d0ef7256f738ea7400d193b6b

4 years agotests: unit: event-rule unit testing
Jonathan Rajotte [Thu, 28 Nov 2019 19:19:16 +0000 (14:19 -0500)] 
tests: unit: event-rule unit testing

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic25fac00027b9b62c7d9dd060e75228aa7e86c57

4 years agoevent-rule: introduce event rule tracepoint
Jonathan Rajotte [Wed, 11 Mar 2020 17:58:48 +0000 (13:58 -0400)] 
event-rule: introduce event rule tracepoint

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I77495a7b0e359e6f513dfc862dec65b4946d08ba

4 years agoevent-rule: introduce event rule uprobe
Jonathan Rajotte [Wed, 11 Mar 2020 17:56:36 +0000 (13:56 -0400)] 
event-rule: introduce event rule uprobe

This is the "--userspace-probe" option of the enable-event command line.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I43478cb75bd8fdc28e2927d53e0a9a09f2283dea

4 years agoevent-rule: introduce event rule syscall
Jonathan Rajotte [Wed, 11 Mar 2020 17:55:16 +0000 (13:55 -0400)] 
event-rule: introduce event rule syscall

This is the "--syscall" option of the enable-event command line.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9c36cb0440e1f26a3b436782eb437a8ddffbe4fa

4 years agoevent-rule: introduce event-rule kprobe
Jonathan Rajotte [Wed, 11 Mar 2020 17:52:38 +0000 (13:52 -0400)] 
event-rule: introduce event-rule kprobe

This is the "--probe" option of the enable-event command line.

Change-Id: I6d763df53d5e838ea2266d49b1f22bc23a9addb1
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
4 years agoevent-rule: lttng_event_rule base object
Jonathan Rajotte [Wed, 11 Mar 2020 17:50:08 +0000 (13:50 -0400)] 
event-rule: lttng_event_rule base object

A lttng_event_rule object is the base object representing an event-rule.

We plan on using the event-rule object for compositing with a new
condition type.

This also paves the way toward further improvement of the lttng_event
related APIs.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5bbfa3b75337f040cf7e565447d4a4af590ed043

4 years agoIntroduce kernel-probe locations
Jonathan Rajotte [Mon, 24 Aug 2020 19:50:49 +0000 (15:50 -0400)] 
Introduce kernel-probe locations

Kernel probe can be configured by two type of location.

The first one is via address:
lttng_kernel_probe_location_address_create.

The second one is using a symbol name combined with an offset.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Icd280e7a8403c761987472f0a416ef5188a2068d

4 years agouserspace-probe: replace explicit null-termination check
Jérémie Galarneau [Thu, 10 Sep 2020 16:23:28 +0000 (12:23 -0400)] 
userspace-probe: replace explicit null-termination check

Replace explicit null-termination checks by uses of
lttng_buffer_view_contains_string() which provides the same
guarantees and ensures the string pointer is within the view.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If0038a82ad7dfe1ed0e8cdef7870d5e25d62200d

4 years agoRevert "userspace-probe: replace explicit null-termination check"
Jérémie Galarneau [Fri, 11 Sep 2020 15:10:44 +0000 (11:10 -0400)] 
Revert "userspace-probe: replace explicit null-termination check"

This reverts commit b9e63e21bd01c0deeaec2195ba912e38460bc038.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I909339dbdcf4a7931ad95c5b4b6af152aa52487c

4 years agoTests: clean-up: remove trailing dot in snapshot test statements
Jérémie Galarneau [Thu, 10 Sep 2020 16:29:50 +0000 (12:29 -0400)] 
Tests: clean-up: remove trailing dot in snapshot test statements

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I95208d67b18ba806f009b9a6b53d5e2cc38b6c74

4 years agouserspace-probe: replace explicit null-termination check
Jérémie Galarneau [Thu, 10 Sep 2020 16:23:28 +0000 (12:23 -0400)] 
userspace-probe: replace explicit null-termination check

Replace explicit null-termination checks by uses of
lttng_buffer_view_contains_string() which provides the same
guarantees and ensures the string pointer is within the view.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If0038a82ad7dfe1ed0e8cdef7870d5e25d62200d

4 years agouserspace-probe: log function name on invalid parameter error
Jérémie Galarneau [Thu, 10 Sep 2020 16:22:59 +0000 (12:22 -0400)] 
userspace-probe: log function name on invalid parameter error

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I31bb1e7af0cde85a7ddb4614796cae1da3e5b1b9

4 years agoAllow run-as to generate filter bytecode.
Jonathan Rajotte [Fri, 29 Nov 2019 21:12:28 +0000 (16:12 -0500)] 
Allow run-as to generate filter bytecode.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3742bb00b7c753b7d256cdff1889a5e90865608b

4 years agoFix: add missing errno.h in pthread compat
Michael Jeanson [Wed, 26 Aug 2020 18:30:07 +0000 (14:30 -0400)] 
Fix: add missing errno.h in pthread compat

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idf574e6e0c7f535693149f3c88f1784188fb14da

4 years agoAdd common util to set thread name
Michael Jeanson [Wed, 12 Aug 2020 21:08:07 +0000 (17:08 -0400)] 
Add common util to set thread name

Use the same code to set all the thread names and fail gracefully on
platforms that don't support them.

This sets the minimum requirement for thread names on Linux to Glibc
>= 2.12, which seems reasonable.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I61c0b9adb6c2309fed91b5a1b11ebc5ee2a637ce

4 years agoFix: liblttng-ctl: unchecked return value on buffer append
Jérémie Galarneau [Fri, 21 Aug 2020 18:28:54 +0000 (14:28 -0400)] 
Fix: liblttng-ctl: unchecked return value on buffer append

Allocation failures can cause lttng_dynamic_buffer_append to fail;
its result should always be checked.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id1870b1e19d3451afdd1e992355d83d4028b5723

4 years agoFix: action executor: double work list unlock on error
Jérémie Galarneau [Fri, 21 Aug 2020 18:16:10 +0000 (14:16 -0400)] 
Fix: action executor: double work list unlock on error

The action executor executes its queued work items without holding the
work list lock. This can result in a double-unlock when a fatal error
occurs during the processing of a work item.

A check for the "should_quit" flag is added before unlocking since this
is the only other reason for the thread to exit its loop.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8d6b1a3511174abd08848ac3677cdf8a326fa8c5

4 years agoMove filter related code to libfilter under libcommon
Jonathan Rajotte [Thu, 28 Nov 2019 21:09:16 +0000 (16:09 -0500)] 
Move filter related code to libfilter under libcommon

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I006dcac801ce1ec742de3a03f21a1c5f8b698298

4 years agoClean-up: consumer: consumer_metadata_cache_write is not const-correct
Jérémie Galarneau [Thu, 20 Aug 2020 19:40:10 +0000 (15:40 -0400)] 
Clean-up: consumer: consumer_metadata_cache_write is not const-correct

`data` is used as a source argument and can be marked as `const`.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9b2b32fe253e1b89a2605ed9c859a04141b321d5

4 years agoFix: memcpy used on potentially overlapping regions
Jérémie Galarneau [Thu, 20 Aug 2020 19:38:18 +0000 (15:38 -0400)] 
Fix: memcpy used on potentially overlapping regions

Caught by reviewing unrelated code, these two uses of memcpy
can operate on overlapping buffers. I checked all other uses
of "raw" memcpy and those appear safe.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I72b1204bc52a92015042adb6a67b022d140f5b4e

4 years agosessiond: notification: use lttng_payload for communications
Jonathan Rajotte [Tue, 14 Jul 2020 18:56:37 +0000 (14:56 -0400)] 
sessiond: notification: use lttng_payload for communications

Allows passing of fds related to object (e.g userspace probes).

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7bb3a91c71016b2939b0e05aca60d57c2da14a20

4 years agoFix: sessiond: client/client_list lock inversion on disconnect
Jérémie Galarneau [Tue, 18 Aug 2020 20:01:30 +0000 (16:01 -0400)] 
Fix: sessiond: client/client_list lock inversion on disconnect

Coverity reports a lock inversion scenario in
handle_notification_thread_client_disconnect() where a client's lock is
held while acquiring the client list lock. This is indeed a problem.

As indicated in the notification_client and notification_client_list
comments, the locking was shoe-horned to make it possible for the action
executor to enqueue notifications in a client's outgoing queue and flush
it.

Since this is the only access pattern that is supported, the client
locking is reworked slightly to only acquire the client lock when
checking the "active" flag, interacting with the outbound communication
state, and sending through a client's socket.

This change makes the client locking regions more narrow which accounts
for the somewhat large number of lines affected.

The updates to the `active` flag on error are moved to the function that
flushes the outbound queue instead of expecting the callers to set it.
This allows the locking to be limited to this function rather than
relying on the callers.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8632d0f7785ec727dabd329bdfba010fd5e4643a

4 years agoFix: sessiond: missing rcu read lock on client in/out events
Jérémie Galarneau [Mon, 17 Aug 2020 20:55:39 +0000 (16:55 -0400)] 
Fix: sessiond: missing rcu read lock on client in/out events

Users of get_client_from_sock() must hold the RCU read lock
for the duration of the use of the notification_client.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I644e549187ee47c959eeb692e27be111343d8979

4 years agosessiond: enforce user-exclusive session access in session_access_ok
Jérémie Galarneau [Fri, 14 Aug 2020 20:59:18 +0000 (16:59 -0400)] 
sessiond: enforce user-exclusive session access in session_access_ok

The current session_access_ok logic disallows the access to a session
when:
  uid != session->uid && gid != session->gid && uid != 0

This means that any user that is part of the same primary group as the
session's owner can access the session. The primary group is not
necessarily (and most likely) not the `tracing` group. Moreover, the
`tracing` group is not meant to provide shared access to sessions, but
to allow interactions with a root session daemon.

For instance:
  - the session has uid = 1000, gid = 100
  - the current user has uid = 1001, gid = 100

access to the session is granted.

This is way too broad and unexpected from most users as the LTTng
documentation never mentions this "primary group share tracing sessions"
behaviour. The documentation only alludes to the fact that separate
users have "their own set of sessions".

On most distributions, this change will have no impact as `useradd`
creates a new group for every user. Users will never share a primary
group and thus can't control each others' sessions.

However, it is not unusual to have users share a primary group (e.g.
`users`) and set the default umask to `0700`. In that case, there is no
expectation that every user will share files and there would be no
reasonable expectation that they should share all sessions.

For instance, it would be unexpected for one user to tear down the
sessions of other users with a single `lttng destroy -a` command.

If this type of session sharing is desirable to some users, then the
default umask of users could be checked or sessions could be created as
part of a group. However, in doubt, it is preferable to be strict.

This is not marked as a fix since this was most likely deliberate and
the change could, although unlikely, break existing deployment
scenarios.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I98f7ffb29d5f6dcb9d660535c1d3f5a1d1a68293

4 years agosessiond: trigger: run trigger actions through an action executor
Jérémie Galarneau [Tue, 11 Feb 2020 04:29:18 +0000 (23:29 -0500)] 
sessiond: trigger: run trigger actions through an action executor

The `action executor` interface allows the notification subsystem to enqueue
work items to execute on behalf of a given trigger. This allows the notification
thread to remain responsive even if the actions to execute are blocking (as
through the use of network communication).

Before this commit, the notification subsystem only handled `notify` actions;
handling code for new action types are added as part of the action executor.

The existing `notify` action is now performed through the action executor so
that all actions can be managed in the same way.

This is less efficient than sending the notifications directly, but could be
optimized trivially (if it ever becomes a problem) when:
  - the action is a group containing only a `notify` action,
  - the action is a `notify` action.

Overview of changes to existing code
===

Managing the new action types requires fairly localized changes to the existing
notification subsystem code. The main code paths that are modified are the sites
where `evaluation` objects are created:
  1) on an object state change (session or channel state changes, see
     handle_notification_thread_channel_sample and
     handle_notification_thread_command_session_rotation),
  2) on registration of a trigger (see
     handle_notification_thread_command_register_trigger),
  3) on subscription to a condition (see client_handle_message_subscription).

To understand the lifetime of most objects involved in a work deferral to the
action executor, see the paragraph in notification-thread-internal.h (line 82)
to understand the relation between clients and client lists.

1) Object state changes

As hinted in the notification_client_list documentation, deferring work on a
state change is straight-forward: a reference is taken on a client list and the
list is provided to the action executor as part of a work item.

Hence, very little changes are made to the the two state-change handling sites
beyond enqueuing a work item rather than directly sending a notification.

2) Subscription to a condition

A notification client can subscribe to a condition before or after a matching
trigger (same condition and containing a notify action) has been registered.

If no matching trigger were registered, no client list exists and there is
nothing to do.

If a matching trigger existed, a client list (which could be empty) will already
exist and the client is simply added to the client list. However, it is
important to evaluate the condition for the client (as the condition could
already be true) and send the notification to that client only and not to all
clients in the list.

Before this change, since everything was done in the same thread, a temporary
list containing only the newly-subscribed client was created on the stack and
the notification was sent/queued immediately. After sending the condition, the
client was removed from the temporary list and added to the "real" client list.

This strategy cannot be used with the action executor as the "temporary" client
list must exist beyond the scope of the function. Moreover, the notification
subsystem assumes that clients are in per-condition client lists and that they
can safely be destroyed when they are not present in any list.

Fortunately, here we know that the action to perform is to `notify` and nothing
else. The enqueuing of the notification is performed "in place" by the
notification thread without deferring to the action executor.

3) Registration of a trigger

When a client subscribes to a condition, the current state of that condition is
immediately evaluated. If the condition is true (for instance, a channel's
buffer are filled beyond X% of their capacity), the action associated with the
trigger is executed right away.

This path requires little changes as a client list is created when a trigger is
registered. Hence, it is possible to use the client list to defer work as is
done in `1`.

4) Trigger registration

Since the `notify` action was the only supported action type, the notification
subsystem always created a client list associated with the new trigger's
condition.

This is changed to only perform the creation (and publication) of the client
list if the trigger's action is (or contains, in the case of a group) a `notify`
action.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I43b54b93c1244591aeff6e0d0fa8076c7b5e0c50

4 years agoRevert "Fix: sessiond: erroneous user check logic in session_access_ok"
Jérémie Galarneau [Mon, 17 Aug 2020 20:06:16 +0000 (16:06 -0400)] 
Revert "Fix: sessiond: erroneous user check logic in session_access_ok"

This reverts commit 4064563ea326f6f26d2c458009beb9ebdb3ba840.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifbcfd0c18631cfca70b2ea85b16824bb26a2a446

4 years agoRevert "sessiond: trigger: run trigger actions through an action executor"
Jérémie Galarneau [Mon, 17 Aug 2020 20:06:14 +0000 (16:06 -0400)] 
Revert "sessiond: trigger: run trigger actions through an action executor"

This reverts commit d1ba29d290281cf72ca3ec7b0222b336c747e925.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I70d57fef86aea94a590720689af751e2554184d0

4 years agoFix: sessiond: erroneous user check logic in session_access_ok
Jérémie Galarneau [Fri, 14 Aug 2020 20:59:18 +0000 (16:59 -0400)] 
Fix: sessiond: erroneous user check logic in session_access_ok

The current session_access_ok logic disallows the access to a session when:
  uid != session->uid && gid != session->gid && uid != 0

This means that any user that is part of the same primary group as the session's
owner can access the session. The primary group is not necessarily (and most
likely) not the `tracing` group.

For instance:
  - the session has uid = 1000, gid = 100
  - the current user has uid = 1001, gid = 100

access to the session is granted.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2e9208286e5508315dae90cb25d34133ca5edcc0

4 years agosessiond: trigger: run trigger actions through an action executor
Jérémie Galarneau [Tue, 11 Feb 2020 04:29:18 +0000 (23:29 -0500)] 
sessiond: trigger: run trigger actions through an action executor

The `action executor` interface allows the notification subsystem to enqueue
work items to execute on behalf of a given trigger. This allows the notification
thread to remain responsive even if the actions to execute are blocking (as
through the use of network communication).

Before this commit, the notification subsystem only handled `notify` actions;
handling code for new action types are added as part of the action executor.

The existing `notify` action is now performed through the action executor so
that all actions can be managed in the same way.

This is less efficient than sending the notifications directly, but could be
optimized trivially (if it ever becomes a problem) when:
  - the action is a group containing only a `notify` action,
  - the action is a `notify` action.

Managing the new action types requires fairly localized changes to the existing
notification subsystem code. The main code paths that are modified are the sites
where `evaluation` objects are created:
  - on an object state change (session or channel state changes, see
    handle_notification_thread_channel_sample and
    handle_notification_thread_command_session_rotation),
  - on registration of a trigger (see
    handle_notification_thread_command_register_trigger),
  - on subscription to a condition (see client_handle_message_subscription).

To understand the lifetime of most objects involved in a work deferral to the
action executor, see the paragraph in notification-thread-internal.h (line 82)
to understand the relation between clients and client lists.

Overview of changes
===

Object state changes

Change-Id: I23290e94d98e781992661f0aee88de9986ed274f
---

As hinted in the notification_client_list documentation, defering work on a
state change is straight-forward: a reference is taken on a client list and the
list is provided to the action executor as part of a work item.

Hence, very little changes are made to the the two state-change handling sites
beyond enqueuing a work item rather than directly sending a notification.

Subscription to a condition
---

A notification client can subscribe to a condition before or after a matching
trigger (same condition and containing a notify action) has been registered.

When a client subscribes to a condition, it is a added to a corresponding
"client list"

Registration of a trigger
---

When a client subscribes to a condition, the current state of
that condition is immediately evaluated. If the condition is true
(for instance, a channel's buffer are filled beyond X% of their
capacity),

TODO:

Change-Id: I7f9bc197715c9ca008a4f1fcd4c86e01b6252dce
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
4 years agoFix: notification: deadlock on cmd_queue.lock and client->lock
Jonathan Rajotte [Tue, 5 May 2020 20:06:58 +0000 (16:06 -0400)] 
Fix: notification: deadlock on cmd_queue.lock and client->lock

Observed issue
==============

Deadlock between the notification thread and the action executor thread.

Thread 5 holds cmd_queue.lock and request the client lock.
Thread 6 holds the client lock and request the cmd_queue lock.

Thread 5 have little value in holding the queue lock considering it effectively to a "pop" of the cmd_queue.

Thread 9 is waiting on the cmd_queue lock but does not hold any other
locks and thus not part of the deadlock but is a casualties of this
deadlock and leave a client "hanging".

Other threads are all in their respective waiting state.

Thread 9 (Thread 0x7f76f2ffd700 (LWP 240467)):
 #0  __lll_lock_wait (futex=futex@entry=0x1ad1308, private=0) at lowlevellock.c:52                                                                                                                                                 [1070/1123]
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x1ad1308) at ../nptl/pthread_mutex_lock.c:80
 #2  0x00000000004611dd in run_command_wait (handle=0x1ad12f0, cmd=0x7f76f2fe31e0) at notification-thread-commands.c:31
 #3  0x000000000046143a in notification_thread_command_unregister_trigger (handle=0x1ad12f0, trigger=0x7f76e4000ef0) at notification-thread-commands.c:148
 #4  0x00000000004444af in cmd_unregister_trigger (cmd_ctx=0x7f76e4000d40, sock=68, notification_thread=0x1ad12f0) at cmd.c:4618
 #5  0x0000000000483d23 in process_client_msg (cmd_ctx=0x7f76e4000d40, sock=0x7f76f2ffcba4, sock_error=0x7f76f2ffcb90) at client.c:2001
 #6  0x000000000047f00b in thread_manage_clients (data=0x1ad1a80) at client.c:2402
 #7  0x000000000047b303 in launch_thread (data=0x1ad1af0) at thread.c:66
 #8  0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #9  0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 6 (Thread 0x7f7700fcf700 (LWP 240464)):
 #0  __lll_lock_wait (futex=futex@entry=0x1ad1308, private=0) at lowlevellock.c:52
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x1ad1308) at ../nptl/pthread_mutex_lock.c:80
 #2  0x0000000000461bf2 in run_command_no_wait (handle=0x1ad12f0, in_cmd=0x7f7700fce340) at notification-thread-commands.c:87
 #3  0x0000000000461b93 in notification_thread_client_communication_update (handle=0x1ad12f0, id=1, transmission_status=CLIENT_TRANSMISSION_STATUS_QUEUED) at notification-thread-commands.c:400
 #4  0x0000000000497658 in client_handle_transmission_status (client=0x7f76f8004e30, status=CLIENT_TRANSMISSION_STATUS_QUEUED, user_data=0x7f76f8004a00) at action-executor.c:154
 #5  0x0000000000467be7 in notification_client_list_send_evaluation (client_list=0x7f76f8004fe0, condition=0x7f76e40041a0, evaluation=0x7f76cc000cc0, trigger_creds=0x7f76e4004288, source_object_creds=0x0, client_report=0x4971a0 <client_ha
 ndle_transmission_status>, user_data=0x7f76f8004a00) at notification-thread-events.c:4007
 #6  0x00000000004956bb in action_executor_notify_handler (executor=0x7f76f8004a00, work_item=0x7f76f80062d0, action=0x7f76e4004210) at action-executor.c:199
 #7  0x00000000004953fd in action_executor_generic_handler (executor=0x7f76f8004a00, work_item=0x7f76f80062d0, action=0x7f76e4004210) at action-executor.c:493
 #8  0x0000000000495101 in action_work_item_execute (executor=0x7f76f8004a00, work_item=0x7f76f80062d0) at action-executor.c:506
 #9  0x0000000000493ff5 in action_executor_thread (_data=0x7f76f8004a00) at action-executor.c:559
 #10 0x000000000047b303 in launch_thread (data=0x7f76f8004aa0) at thread.c:66
 #11 0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #12 0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7f77017d0700 (LWP 240463)):
 #0  __lll_lock_wait (futex=futex@entry=0x7f76f8004e30, private=0) at lowlevellock.c:52
 #1  0x00007f77052c80a3 in __GI___pthread_mutex_lock (mutex=0x7f76f8004e30) at ../nptl/pthread_mutex_lock.c:80
 #2  0x0000000000463080 in handle_notification_thread_command (handle=0x1ad12f0, state=0x7f77017cfb00) at notification-thread-events.c:2936
 #3  0x000000000045e881 in thread_notification (data=0x1ad12f0) at notification-thread.c:705
 #4  0x000000000047b303 in launch_thread (data=0x1ad1420) at thread.c:66
 #5  0x00007f77052c5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #6  0x00007f77051cc103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Cause
=====

The action executor holds the client lock across the communication to
prevent simultaneous update to the client state.

The notification thread holds the cmd_queue lock across operation for no
apparent reason (TODO make sure there is no internal add to the queue. if
so we should reacquire the lock only when necessery.)

Solution
========

Reduce the windows for which the cmd_queue lock is held by the
notification thread to only the "pop" action on the queue. As soon as we
have the lock, get the cmd, remove it from the list and release the
lock. This prevent inverted lock acquisition base on the pattern of the
action executor thread.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I91d30c134bc1a128c96058f0e0cdd325808c91bc

4 years agosessiond: notification: add support for async commands
Jérémie Galarneau [Thu, 13 Feb 2020 21:50:52 +0000 (16:50 -0500)] 
sessiond: notification: add support for async commands

Notification thread commands are currently all blocking. This means
that the emitter of the command uses the notification_thread_command
APIs to (stack) allocate a command context and wait on an lttng_waiter
for a reply.

In preparation for the addition of non-blocking commands, a new type
of asynchroneous commands is introduced. Asynchroneous commands are
"fire-and-forget"; the caller has no way of obtaining the result of
the command.

Asynchroneous command contexts are heap-allocated and their ownership
is transferred to the notification thread. The notification thread
will not attempt to wake-up the emitter and will free() the command
context regardless of the command's result.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6951987dda0262d1500a3b4e1403fb2559a3ff44

4 years agosessiond: notification: refactor: split transmission and poll update
Jérémie Galarneau [Thu, 13 Feb 2020 04:34:08 +0000 (23:34 -0500)] 
sessiond: notification: refactor: split transmission and poll update

Split the notification transmission logic and its effect on a
notification_client from the logic tied to the management of the
notification thread.

This is to make it possible to send (or queue) notifications from the
notification thread or another thread. If another thread encounters an
error or a full socket buffer, a future mechanism will allow it to
signal the notification thread to update its private state (e.g. poll
mask).

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I35d8943cb11473d82b07a4dbc5a0f093cde25a79

4 years agosessiond: notification: synchronize notification client (and list)
Jérémie Galarneau [Tue, 11 Feb 2020 04:06:19 +0000 (23:06 -0500)] 
sessiond: notification: synchronize notification client (and list)

Introduce reference counting to notification_client_list and add
locks to both notification_client and notification_client list.

Of important note, this is in preparation for the introduction of
an action executor thread. The aim of this change is not to make
any part of the notitication sub-system thread-safe in any "general"
sense.

The reference counting and locks are introduced to protect a very
specific usage scenario.

The main thread of the notification subsystem and the action executor
will interact through triggers, evaluations, and client lists.

If the action executor needs to send a notification to a list of
client during the execution of an action group, it obtains the client
list and acquires a reference to it. It then locks the list to iterate
on it, allowing it to send the notification(s) to all subscribed
clients.

Holding the list lock prevents the main thread from disconnecting and
subsequently destroying a client. Holding a reference to the list also
prevents the list from being reclaimed due to a concurent 'unregister
trigger' operation.

No provision for other access scenarios are taken into account.

Squashed fix, otherwise the tests would hang.
Fix: don't hold client lock while handling subscription changes

Holding the client's lock while handling subscription changes causes a
lock inversion between the client list lock and the client lock.

This happens when a client subscribes to a condition that evaluates to
'true' at the time of the subscription. When this happens, a
notification is sent right away and that communication will attempt to
acquire the client lock.

Holding the client lock for such a long period is not needed anyhow
and we can simply protect the communication state when it is actually
modified/used.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7f38f81fa8bc32e5384538acdffab0824862cff2

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I88a9f66d31e3127b2bcc3f04016c364e0b3fe9ce

4 years agosessiond: notification: introduce the notion of 'active' client
Jérémie Galarneau [Mon, 10 Feb 2020 23:35:27 +0000 (18:35 -0500)] 
sessiond: notification: introduce the notion of 'active' client

Since notification_clients are now accessed from multiple threads, it
is possible for a thread to access a client while it is being
"cleaned-up" following an error.

The 'active' communication flag allows a check to be performed before
any communication is attempted with a client. The communication is
considered 'active' once the handshake has been performed. It is
considered 'inactive' if a fatal protocol error occurs.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If16f83ef12fe9a6ef597ae464867011811389177

4 years agosessiond: notification: maintain an id to notification_client ht
Jérémie Galarneau [Sat, 8 Feb 2020 04:21:40 +0000 (23:21 -0500)] 
sessiond: notification: maintain an id to notification_client ht

In preparation for the addition of an action execution worker, add a
client_id_ht which will allow the action worker to send commands to
the notification thread referencing clients by 'id' rather than by
their socket.

This is done in order to prevent FD re-use races between the various
threads at play when a communication error occurs on a notification
client socket.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7b8774c23a4a2a7c41ed8c1652f23d29954a0771

4 years agoFix: tests: `pgrep -f` flags unrelated process as lttng-sessiond
Jonathan Rajotte [Fri, 22 May 2020 14:36:46 +0000 (10:36 -0400)] 
Fix: tests: `pgrep -f` flags unrelated process as lttng-sessiond

Observed issue
==============

lttng-sessiond is not started by start_lttng_sessiond_opt and a vim
process is killed on stop_lttng_sessiond_opt.

Cause
=====
We uses "pgrep -f" with the interested pattern to gather the pids that
should be the lttng processes we are interested in. `pgrep -f` yields
false positives since it matches against the complete cmdline including
parent directory of the command and all arguments.

For example, the following will currently match for the sessiond
pattern:
 vim src/bin/lttng-sessiond/notification-thread-internal.h

This prevents the launch of sessiond by start_lttng_sessiond_opt and end
up killing the vim process on stop_lttng_sessiond_opt.

Solution
========
To alleviate this, we propose a two stage lookup. The first stage uses
"pgrep -f" yielding potential candidates. The second stage performs
grep on the basename of the first field of the /proc/[pid]/cmdline
for each pid candidates.

The first field of /proc/[pid]/cmdline corresponds to the actual command.
We use the basename to ensure that we do not match on the path to the
executable.

Known drawbacks
=========
None

References
==========

https://review.lttng.org/c/lttng-tools/+/3043

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I479ebad27f4965ae16d4442a6fe58ff3157d76fa

4 years agologging: print human-readable thread names when logging
Jérémie Galarneau [Tue, 11 Feb 2020 23:22:13 +0000 (18:22 -0500)] 
logging: print human-readable thread names when logging

The lttng_thread interface used by the session daemon uniquely names
all threads. This name can be used to augment the thread's logging
statement with a human-readable name rather than using the pid/tid
tuple used elsewhere.

Additionally, the thread name is set using the pthread API so that it
is visible in GDB and other tools (e.g. htop).

Invocations of pgrep in the test utilities are replaced by 'pgrep -f',
which matches against the process name.

We are not the first to encounter this problem after renaming the main
thread, see
https://github.com/mongodb/mongo/commit/726cafd713c7333640f8458ec9808ed4f678e3a7#diff-a9003101d1e4a99ac2d43d9b1b839587R122

pgrep uses the name name in /proc/$PID/status which contains the
thread name, not the executable name. In the case of the sessiond,
this is now "Main".

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I73dfe8683b2ea31f7ed0c2ffdfa8332f36e28f9b

4 years agosessiond: clarify the role of notification credentials
Jérémie Galarneau [Tue, 11 Aug 2020 19:58:08 +0000 (15:58 -0400)] 
sessiond: clarify the role of notification credentials

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ica7a7370fcc4d34c9af11bcb9e7435e19b29a6f8

4 years agoUse lttng_trigger credentials to send evaluation to client
Jonathan Rajotte [Tue, 24 Mar 2020 18:08:16 +0000 (14:08 -0400)] 
Use lttng_trigger credentials to send evaluation to client

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I49b4b2aeda09d09b7d8630562660dac96f36b3e7

4 years agotrigger: introduce refcounting
Jonathan Rajotte [Fri, 7 Feb 2020 22:40:46 +0000 (17:40 -0500)] 
trigger: introduce refcounting

Will be used for listing and much more use.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic9ee8753257fcbe15fc1321b18582628328c0154

4 years agotrigger: use condition and action ref counting to ease internal objects management
Jonathan Rajotte [Fri, 7 Aug 2020 20:22:36 +0000 (16:22 -0400)] 
trigger: use condition and action ref counting to ease internal objects management

Currently a trigger object has multiple relationship types with the action and
condition associated to it.

On the API client side, the trigger does not own the action and condition
associated to it. The user is responsible for managing the lifetime of the
associated action and condition object.

On the sessiond side, triggers created from `lttng_trigger_create_from_payload`
currently require that before calling lttng_trigger_destroy, the action and
condition be fetched and deleted using their respective destructor. This
operation cannot be done inside `lttng_trigger_destroy` since the exposed API is
clear that the trigger does no own the objects.

We can facilitate the lifetime/ownership management of the action and condition
objects using their respective reference counting mechanism.

On a trigger creation, the trigger get references on both object. On destroy,
the trigger put the references on the objects.

From an API client perspective, nothing changes. Even better, it prevents
premature freeing of these objects, since the trigger have references to these
objects.

On the sessiond side, we can now move the actual ownership of the action and
condition object to the trigger object. This is done in
`lttng_trigger_create_from_payload` forcing a put of the local references to the
object, effectively moving the ownership to the trigger object.

Note that the `lttng_trigger_get_{action, condition}` do not `get` a reference
to the object before returning it. This is done to comply with the API that was
introduced back in 2.11 which does expect a client to call lttng_{action,
condition}_destroy on the returned object.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I31d196d00c886fcf42c1b16c55e693949373568b

4 years agocondition: introduce reference counting
Jonathan Rajotte [Fri, 7 Aug 2020 19:39:24 +0000 (15:39 -0400)] 
condition: introduce reference counting

This will allows easier management of the trigger ownership of its associated
condition and action objects.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I46420b03bd4bf7948bc2d1f44985edfe86c27c61

4 years agoClean-up: tests: fd-tracker: change spaces to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: tests: fd-tracker: change spaces to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I24fde39587495f5f48f900898755e57c10fbf303

4 years agoClean-up: relayd index: change spaces to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: relayd index: change spaces to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6a79fcdceebc0959bb66525daac288953e863bb3

4 years agoClean-up: sessiond comm relay: change spaces to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: sessiond comm relay: change spaces to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2c42c7b7574ccec85524a1b0aed33d108725ad4a

4 years agoClean-up: compat time: change spaces to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: compat time: change spaces to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If28840c388baffd257e61ef1ecad7c1ea3e3fd68

4 years agoClean-up: kernel consumer: change spaces to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: kernel consumer: change spaces to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I55873fe5500e00d725ebaa241f4054a793b15aee

4 years agoClean-up: sessiond ust-app: change spaces to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: sessiond ust-app: change spaces to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4ba505a8206f4ee8838fc3268e3904c496eb3802

4 years agoClean-up: sessiond notification thread: change spaces to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: sessiond notification thread: change spaces to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ieabe7e8a2066b30ef8b413896d03f8ff824ddc99

4 years agoClean-up: sessiond kernel: change spaces to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: sessiond kernel: change spaces to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3b8f476db2f9038fddf3b648dba0fa06f20db5d8

4 years agoClean-up: sessiond kernel: fix include style
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: sessiond kernel: fix include style

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8959283af0cafa94b0939e0025b08b40ff86653e

4 years agoClean-up: sessiond consumer: change space to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: sessiond consumer: change space to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I367ffe89e3930b2e84ad8c70b95bd95a968d974c

4 years agoClean-up: sessiond: change space to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: sessiond: change space to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idb749331397c2b43ea41a33f09bf35d15c4b8f8a

4 years agoClean-up: sessiond manage-consumer: change space to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: sessiond manage-consumer: change space to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic897e04aa96bd58c6e8dd340f8537af86c2dd478

4 years agoClean-up: relayd: change space to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: relayd: change space to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4aa7d69ef823e4ea85820cadd23463869f98edf2

4 years agoClean-up: sessiond command: fix include style
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: sessiond command: fix include style

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2f3d695a97d31ca5b098872abc566a8b261cdf9c

4 years agoClean-up: sessiond command: change space to tabs
Jérémie Galarneau [Tue, 11 Aug 2020 16:22:19 +0000 (12:22 -0400)] 
Clean-up: sessiond command: change space to tabs

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ib809c788f3bf32360cd68b16972636db3ef4dfd7

This page took 0.058862 seconds and 4 git commands to generate.