lttng-tools.git
Jérémie Galarneau [Thu, 5 May 2022 19:15:44 +0000 (15:15 -0400)] 
Add lttng::utils::time_to_iso8601_str

lttng::utils::time_to_iso8601_str implements the same formatting
as time_to_iso8601_str, but returns an std::string.
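A minimal sketch of how such a wrapper could look, assuming the C-style helper writes into a caller-provided buffer and returns 0 on success. The `sketch` namespace, the format string, and the buffer size are assumptions made for illustration, not the actual lttng-tools code.

```cpp
#include <array>
#include <cstddef>
#include <ctime>
#include <stdexcept>
#include <string>
#include <time.h>

namespace sketch {

// Stand-in for the C-style helper: formats into a caller-provided buffer.
// The format string is an assumption made for this example.
int time_to_iso8601_str(std::time_t time, char *str, std::size_t len)
{
    std::tm tm_storage;

    if (!::localtime_r(&time, &tm_storage)) {
        return -1;
    }

    // strftime() returns 0 when the result does not fit in the buffer.
    return std::strftime(str, len, "%Y%m%dT%H%M%S%z", &tm_storage) == 0 ? -1 : 0;
}

namespace utils {

// Same formatting, but the result is returned as an std::string and
// failures are reported with an exception instead of a return code.
std::string time_to_iso8601_str(std::time_t time)
{
    std::array<char, sizeof("YYYYmmddTHHMMSS+HHMM")> buffer;

    if (sketch::time_to_iso8601_str(time, buffer.data(), buffer.size()) != 0) {
        throw std::runtime_error("Failed to format time to ISO 8601 string");
    }

    return std::string(buffer.data());
}

} // namespace utils
} // namespace sketch
```

Returning a string by value lets callers use the result directly in expressions without managing a buffer.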

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0bd7dbbdc2c3bae6fdef7917936450953af72175

Jérémie Galarneau [Tue, 3 May 2022 20:20:26 +0000 (16:20 -0400)] 
Add vendor/fmt

Add fmt 8.1.1 headers (we will use it in header-only mode). fmt is made
available under the MIT license, which is already in the LICENSES
directory.

Note that an lttng-format.hpp header is added to disable a warning which
prevents us from building with -Werror.

../../../src/vendor/fmt/format-inl.h:2457:11: error: target of initialization might be a candidate for a format attribute [-Werror=suggest-attribute=format]
 2457 |     int (*snprintf_ptr)(char*, size_t, const char*, ...) = FMT_SNPRINTF;
      |           ^~~~~~~~~~~~

The header also ensures that FMT_HEADER_ONLY is defined for all uses of
libfmt.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5696c09d6e07716b955091922bb27ce082fb2686

Jérémie Galarneau [Mon, 2 May 2022 19:35:40 +0000 (15:35 -0400)] 
sessiond: Move trace_ust_clock to a clock_attributes_sample class

Move trace clock functions to a class that samples the clock's
attributes on creation. This makes it easier to implement trace format
agnostic serialization facilities in follow-up patches.
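As a rough illustration of "sampling on creation", the sketch below captures a clock's attributes once in the constructor and exposes them as const members. The class layout, the attribute names, and the use of `std::chrono` are assumptions made for this example; the actual class may sample different attributes from a different source.

```cpp
#include <chrono>
#include <cstdint>
#include <string>

namespace sketch {

// Hypothetical sample of a clock's attributes, taken once at construction.
class clock_attributes_sample {
public:
    clock_attributes_sample() :
        name("monotonic"),
        frequency(sample_frequency()),
        offset(sample_offset())
    {
    }

    // Attributes are immutable once sampled.
    const std::string name;
    const std::uint64_t frequency;
    const std::int64_t offset;

private:
    // Ticks per second of the monotonic clock.
    static std::uint64_t sample_frequency()
    {
        using period = std::chrono::steady_clock::period;

        return static_cast<std::uint64_t>(period::den / period::num);
    }

    // Difference, in nanoseconds, between the realtime and monotonic clocks.
    static std::int64_t sample_offset()
    {
        using std::chrono::duration_cast;
        using std::chrono::nanoseconds;

        const auto realtime = duration_cast<nanoseconds>(
                std::chrono::system_clock::now().time_since_epoch());
        const auto monotonic = duration_cast<nanoseconds>(
                std::chrono::steady_clock::now().time_since_epoch());

        return static_cast<std::int64_t>((realtime - monotonic).count());
    }
};

} // namespace sketch
```

Because every attribute comes from a single sample, a serializer can read the same values repeatedly without re-querying the clock.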

Change-Id: Id75b2c6e00779710e02691da107b2e93bf33ff12
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 2 May 2022 19:33:09 +0000 (15:33 -0400)] 
Change backing type of lttng_uuid to std::array

Changing the backing type of lttng_uuid to std::array allows us to
return lttng_uuid from a function. This, in turn, makes it possible to
initialize const attributes from the return value of a function
returning a UUID.
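The point can be illustrated with a short sketch: a raw array type cannot be returned by value, whereas an `std::array` can, so it can initialize a const member in a constructor's initializer list. The names `uuid`, `generate_uuid`, and `session_sketch` are hypothetical and only serve this example.

```cpp
#include <array>
#include <cstdint>
#include <random>

namespace sketch {

// An std::array is copyable and returnable, unlike `unsigned char[16]`.
using uuid = std::array<std::uint8_t, 16>;

// Hypothetical generator returning a random (version 4) UUID by value.
uuid generate_uuid()
{
    std::random_device random_source;
    uuid id;

    for (auto& byte : id) {
        byte = static_cast<std::uint8_t>(random_source());
    }

    // Set the version (4) and variant (RFC 4122) bits.
    id[6] = static_cast<std::uint8_t>((id[6] & 0x0F) | 0x40);
    id[8] = static_cast<std::uint8_t>((id[8] & 0x3F) | 0x80);
    return id;
}

class session_sketch {
public:
    // The const member is initialized directly from the returned value.
    session_sketch() : id(generate_uuid())
    {
    }

    const uuid id;
};

} // namespace sketch
```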

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie092eab4a848a41ddd9c63f779514f1e4ca2a441

Jérémie Galarneau [Fri, 29 Apr 2022 02:06:25 +0000 (22:06 -0400)] 
sessiond: Split ust_registry_session into per-type classes

This is a preliminary refactoring step to implement support for the
conditional generation of CTF 1.8/2.0 stream description layouts.

Splitting the registry session will simplify the implementation of a
serialization visitor by segregating per-type environment attributes.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia95dd0c67f2ff41ce4f771ce776ff84a214098b9

Jérémie Galarneau [Thu, 5 May 2022 19:00:10 +0000 (15:00 -0400)] 
sessiond: Replace uses of session_trylock_list by a dedicated assert macro

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I925a2f4052149b3a9ff91a80c7541dc8ed226c70

Jérémie Galarneau [Fri, 29 Apr 2022 19:43:14 +0000 (15:43 -0400)] 
Add basic exception types and throwing facilities

Add two LTTng-specific exception types:
  - lttng::ctl::error
  - lttng::posix_error

These types are meant to help transition away from error-code-based
error handling in RAII-safe functions.

lttng::ctl::error wraps `enum lttng_error_code`. It is meant to be
thrown using the `LTTNG_THROW_CTL` macro which samples the throw-site
(file name, function name, line number). This should be used only
in code paths that provide the liblttng-ctl interface.

It should, ultimately, be thrown in code that is specific to the
implementation of the various liblttng-ctl commands and not all over the
place since it contains very little information beyond the error code.

lttng::posix_error wraps `errno` values that are used in various places
to report errors involving (mostly) syscalls.

Over time, more specific exception types will be added.
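The sketch below shows one way such exception types and a throw-site-sampling macro could be structured. Everything in it (the `sketch` namespace, the `ctl_error_code` stand-in enumeration, the `SKETCH_THROW_*` macros, and the choice of base classes) is an assumption made for illustration, not the definitions added by this commit.

```cpp
#include <cerrno>
#include <stdexcept>
#include <string>
#include <system_error>

namespace sketch {

// Stand-in for `enum lttng_error_code`.
enum ctl_error_code { CTL_ERR_UNKNOWN = 10, CTL_ERR_INVALID = 11 };

// Wraps an error code along with the location of the throw-site.
class ctl_error : public std::runtime_error {
public:
    ctl_error(ctl_error_code code, const char *file, const char *function,
            unsigned int line) :
        std::runtime_error(std::string(function) + "() at " + file + ":" +
                std::to_string(line) + ": error code " +
                std::to_string(static_cast<int>(code))),
        _code(code)
    {
    }

    ctl_error_code code() const noexcept
    {
        return _code;
    }

private:
    ctl_error_code _code;
};

// Wraps an errno value; std::system_error provides the textual description.
class posix_error : public std::system_error {
public:
    posix_error(const std::string& msg, int errno_code, const char *file,
            const char *function, unsigned int line) :
        std::system_error(errno_code, std::generic_category(),
                msg + " [" + function + "() at " + file + ":" +
                        std::to_string(line) + "]")
    {
    }
};

} // namespace sketch

// The macros sample the throw-site so that callers don't have to.
#define SKETCH_THROW_CTL(code) \
    throw sketch::ctl_error((code), __FILE__, __func__, __LINE__)
#define SKETCH_THROW_POSIX(msg, errno_code) \
    throw sketch::posix_error((msg), (errno_code), __FILE__, __func__, __LINE__)
```

A call site would then read `SKETCH_THROW_POSIX("Failed to open file", errno);`, and the handler can recover the wrapped value through `code()`.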

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I02e104f28dd8149aee70211b5849f3502f16d58b

Jérémie Galarneau [Thu, 28 Apr 2022 23:18:12 +0000 (19:18 -0400)] 
.clang-format: tweak C++ style

Don't indent the contents of namespaces or the lines following access modifiers.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifb67843c7c89d1f49dca9f59a76273f3b0b8fb3a

Jérémie Galarneau [Thu, 28 Apr 2022 15:28:15 +0000 (11:28 -0400)] 
Add make_unique_wrapper()

make_unique_wrapper is intended to facilitate the use of std::unique_ptr
to wrap C-style APIs that don't provide RAII resource management facilities.

Usage example:

   // API
   struct my_c_struct {
           // ...
   };

   struct my_c_struct *create_my_c_struct(void);
   void destroy_my_c_struct(struct my_c_struct *value);

   // Creating a unique_ptr to my_c_struct.
   auto safe_c_struct =
           lttng::make_unique_wrapper<my_c_struct, destroy_my_c_struct>(
                   create_my_c_struct());

Note that this facility is intended for use in the scope of a function.
If you need to return this unique_ptr instance, you should consider writing
a proper, idiomatic, wrapper.
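For readers curious how such a helper can be built, here is a minimal sketch under the assumption that the destroy function is passed as a non-type template parameter and invoked by a stateless deleter. The `sketch` namespace, the `function_deleter` name, and the instance counter are hypothetical; the actual implementation may differ.

```cpp
#include <memory>

namespace sketch {

// Stateless deleter that forwards to a C-style destroy function.
template <typename WrappedType, void (*DeleterFunction)(WrappedType *)>
struct function_deleter {
    void operator()(WrappedType *instance) const
    {
        DeleterFunction(instance);
    }
};

// Wraps a raw pointer in a unique_ptr that calls DeleterFunction on scope exit.
template <typename WrappedType, void (*DeleterFunction)(WrappedType *)>
std::unique_ptr<WrappedType, function_deleter<WrappedType, DeleterFunction>>
make_unique_wrapper(WrappedType *instance)
{
    return std::unique_ptr<WrappedType,
            function_deleter<WrappedType, DeleterFunction>>(instance);
}

} // namespace sketch

// Example C-style API; the counter only exists to observe destruction.
struct my_c_struct {
    int value;
};

static int live_instances = 0;

my_c_struct *create_my_c_struct()
{
    ++live_instances;
    return new my_c_struct{ 42 };
}

void destroy_my_c_struct(my_c_struct *value)
{
    delete value;
    --live_instances;
}
```

Since the deleter is part of the unique_ptr's type and carries no state, the wrapper is no larger than a raw pointer, but the resulting type is verbose, which is one reason to keep it local to a function.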

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I429fc6f62896efb04af95fc26143096043206265

Simon Marchi [Fri, 12 Nov 2021 15:09:35 +0000 (10:09 -0500)] 
Add vendor/optional.hpp

Taken from:

https://github.com/martinmoene/optional-lite/blob/a006f229a77b3b2dacf927e4029b8c1c60c86b52/include/nonstd/optional.hpp

The BSL-1.0 license is already in the LICENSES directory, so no need to
add it.

Change-Id: I47e9a3264b771b0a6aaefc022ada9e051b6b6d20
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 27 Apr 2022 22:08:48 +0000 (18:08 -0400)] 
Clean-up: ust-consumer: replace ad-hoc channel destruction

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5018d841035eb302c0f3c092efc570b3eaa71198

Jérémie Galarneau [Mon, 6 Jun 2022 16:07:15 +0000 (12:07 -0400)] 
Tests: test_session: include tap.h last

tap.h defines a number of macros (e.g. ok, fail) that are very likely to
clash with other headers. On gcc 7.5.0, builds fail whenever
tap.h is included before a header that transitively includes
basic_ios.h.

This clash doesn't occur with more recent gcc releases (tested with 11.2
on my local machine).
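The include-order issue can be sketched with stand-in macros. The `ok` and `fail` definitions below are simplified assumptions, not the real tap.h macros; the point is only that any code naming `fail(` (such as the `fail()` member of standard streams) must be parsed before the macros exist.

```cpp
// Standard headers and helpers come first: they are parsed before any
// macro named `ok` or `fail` is defined.
#include <cstdio>
#include <sstream>
#include <string>

// Uses std::istringstream::fail(); this would not compile if it appeared
// after the `fail` macro defined below.
bool parses_as_int(const std::string& text)
{
    std::istringstream stream(text);
    int value = 0;

    stream >> value;
    return !stream.fail();
}

namespace sketch {

static int passed = 0;
static int failed = 0;

// Records and prints a test result, TAP-style.
inline bool record_result(bool result, const char *description)
{
    ++(result ? passed : failed);
    std::printf("%s - %s\n", result ? "ok" : "not ok", description);
    return result;
}

} // namespace sketch

// Stand-ins for the tap.h macros, defined last.
#define ok(condition, description) sketch::record_result((condition), (description))
#define fail(description) ok(false, description)
```

With this ordering, a test can still write `ok(parses_as_int("42"), "an integer is parsed");` after the macros are defined.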

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I62aaa14a6d1f20c77073ab2e481eddcb28fb78f3

Jérémie Galarneau [Tue, 17 May 2022 17:41:49 +0000 (13:41 -0400)] 
Fix: lttng-snapshot: use after free of max size argument

gcc 12.1.0 reports:

commands/snapshot.cpp: In function ‘int cmd_snapshot(int, const char**)’:
../../../src/common/error.hpp:139:32: error: pointer ‘max_size_arg’ may be used after ‘void free(void*)’ [-Werror=use-after-free]

Free max_size_arg on both paths.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3775e835e10b364f32f4797afb9c090ac4dc133c

Jonathan Rajotte [Fri, 25 Mar 2022 18:26:38 +0000 (14:26 -0400)] 
Fix: test: lttng kernel modules still loaded after running test_clock_override

Observed issue
==============

After running test_clock_override, some lttng modules are still loaded.

$ lsmod | ag lttng
  lttng_test             32768  0
  lttng_tracer         2326528  1 lttng_test
  lttng_statedump       749568  1 lttng_tracer
  lttng_wrapper          16384  2 lttng_statedump,lttng_tracer
  lttng_uprobes          16384  1 lttng_tracer
  lttng_kprobes          16384  1 lttng_tracer
  lttng_lib_ring_buffer    61440  1 lttng_tracer
  lttng_kretprobes       16384  1 lttng_tracer
  lttng_clock_plugin_test    16384  1
  lttng_clock            16384  2 lttng_tracer,lttng_clock_plugin_test

Cause
=====

The order in which the modules are removed is important.

In `test_clock_override_timestamp`, the order of the last `modprobe --remove` is:

  modprobe --remove lttng-clock-plugin-test lttng-clock lttng-test

while the order at other call sites is:

  modprobe --remove lttng-test lttng-clock-plugin-test lttng-clock

Solution
========

Use

  modprobe --remove lttng-test lttng-clock-plugin-test lttng-clock

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I528df2e3e90664433337a547a74cdbe476d4ee62

Jérémie Galarneau [Fri, 15 Apr 2022 06:09:53 +0000 (02:09 -0400)] 
Fix: lttng: snapshot: add-output: leak of max size parameter

==1920281==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 6 byte(s) in 1 object(s) allocated from:
    #0 0x7fa95633add9 in __interceptor_malloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x7fa955e90c09  (/usr/lib/libpopt.so.0+0x3c09)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I87ce90a77d9624add0cab5d3090a7e83734da7f4

Jérémie Galarneau [Fri, 15 Apr 2022 05:55:45 +0000 (01:55 -0400)] 
Tests: fix: lttng-create: leaked command parameter

==1853705==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 3 byte(s) in 1 object(s) allocated from:
    #0 0x7fb67ee0edd9 in __interceptor_malloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x7fb67e964c09  (/usr/lib/libpopt.so.0+0x3c09)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2f3346157cb26de6712c6e6ebd5fafa6b51fac08

Jérémie Galarneau [Fri, 15 Apr 2022 05:30:50 +0000 (01:30 -0400)] 
Fix: sessiond: rotation trigger leak

==1801304==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 224 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb64175 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb6a291 in lttng_trigger* zmalloc<lttng_trigger>() ../../src/common/macros.hpp:89
    #3 0x559fbeb64aa6 in lttng_trigger_create /home/jgalar/EfficiOS/src/lttng-tools/src/common/trigger.cpp:58
    #4 0x559fbe9dc417 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:87
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 208 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb16e21 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb16e31 in lttng_action_notify* zmalloc<lttng_action_notify>() ../../src/common/macros.hpp:89
    #3 0x559fbeb168a0 in lttng_action_notify_create actions/notify.cpp:135
    #4 0x559fbe9dc34b in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:80
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 160 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb3d7a1 in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb3fa35 in lttng_condition_session_consumed_size* zmalloc<lttng_condition_session_consumed_size>() ../../src/common/macros.hpp:89
    #3 0x559fbeb3e6fd in lttng_condition_session_consumed_size_create conditions/session-consumed-size.cpp:206
    #4 0x559fbe9dc0f1 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:54
    #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 112 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559fbeb242ad in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x559fbeb27062 in zmalloc<(anonymous namespace)::lttng_rate_policy_every_n> ../../src/common/macros.hpp:89
    #3 0x559fbeb25e9f in lttng_rate_policy_every_n_create actions/rate-policy.cpp:492
    #4 0x559fbeb168b9 in lttng_action_notify_create actions/notify.cpp:141
    #5 0x559fbe9dc34b in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:80
    #6 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #7 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #8 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #9 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #10 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Indirect leak of 34 byte(s) in 2 object(s) allocated from:
    #0 0x7fe0f4e19319 in __interceptor_strdup /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:454
    #1 0x559fbeb3f603 in lttng_condition_session_consumed_size_set_session_name conditions/session-consumed-size.cpp:442
    #2 0x559fbe9dc2c4 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:71
    #3 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993
    #4 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246
    #5 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #6 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #7 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

The rotation trigger of a session (used for size-based rotations) is
never cleaned up. It is now cleaned up every time its condition is
hit and whenever the session is destroyed.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5a89341535f87b7851b548ded9838c18bd1ccb95

Jérémie Galarneau [Fri, 15 Apr 2022 05:34:54 +0000 (01:34 -0400)] 
Tests: fix: schedule api: leak of rotation schedule list

==1769573==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7fef37a29fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7fef37792f2f in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x7fef3779573a in lttng_rotation_schedules* zmalloc<lttng_rotation_schedules>() ../../../src/common/macros.hpp:89
    #3 0x7fef377947cc in lttng_rotation_schedules_create /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:353
    #4 0x7fef37794aa0 in get_schedules /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:392
    #5 0x7fef377956dc in lttng_session_list_rotation_schedules /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/rotate.cpp:665
    #6 0x5646131713f2 in test_add_list_remove_schedule /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:252
    #7 0x56461317157b in test_add_list_remove_size_schedule /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:270
    #8 0x564613171680 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/rotation/schedule_api.c:307
    #9 0x7fef373ae30f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9b7eb537d158791db76f9a7676ffeb5d4a1f2203

Jérémie Galarneau [Fri, 15 Apr 2022 05:29:46 +0000 (01:29 -0400)] 
Fix: lttng: enable-rotation: leak of command parameter

==1759491==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 6 byte(s) in 1 object(s) allocated from:
    #0 0x7fdbdc94add9 in __interceptor_malloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x7fdbdc4a0c09  (/usr/lib/libpopt.so.0+0x3c09)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I29cc6ec4390e71829107f309f162247b9be2868c

Jérémie Galarneau [Fri, 15 Apr 2022 04:35:35 +0000 (00:35 -0400)] 
Fix: lttng: track: leaked command parameter

==1676099==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 5 byte(s) in 1 object(s) allocated from:
    #0 0x7f19429d9dd9 in __interceptor_malloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x7f19425342ad in poptGetNextOpt (/usr/lib/libpopt.so.0+0x82ad)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ibafcaf42ad4f842b3fa74cf91dc5ecc8acb3487d

Jérémie Galarneau [Fri, 15 Apr 2022 03:43:10 +0000 (23:43 -0400)] 
Fix: lttng: add-trigger: leak of parser context on capture

==1501334==ERROR: LeakSanitizer: detected memory leaks

Indirect leak of 16386 byte(s) in 1 object(s) allocated from:
    #0 0x7f95efc3cdd9 in __interceptor_malloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x55acb0681ed3 in lttng_filter_yyalloc(unsigned long, void*) filter/filter-lexer.cpp:2511
    #2 0x55acb067f2f2 in lttng_filter_yy_create_buffer(_IO_FILE*, int, void*) filter/filter-lexer.cpp:1895
    #3 0x55acb067ea44 in yyrestart(_IO_FILE*, void*) filter/filter-lexer.cpp:1824
    #4 0x55acb0649a43 in filter_parser_ctx_alloc(_IO_FILE*) filter/filter-parser.ypp:271
    #5 0x55acb0649e7f in filter_parser_ctx_create_from_filter_expression(char const*, filter_parser_ctx**) filter/filter-parser.ypp:332
    #6 0x55acb058ee89 in parse_event_rule commands/add_trigger.cpp:783
    #7 0x55acb05920c0 in handle_condition_event commands/add_trigger.cpp:1361
    #8 0x55acb0592739 in parse_condition commands/add_trigger.cpp:1457
    #9 0x55acb0596b56 in cmd_add_trigger(int, char const**) commands/add_trigger.cpp:2304
    #10 0x55acb05a5b80 in handle_command /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:238
    #11 0x55acb05a6643 in parse_args /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:427
    #12 0x55acb05a694a in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:476
    #13 0x7f95ef28730f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6fa21e7d066e0cf48afc3f91ceefbfd19c6b86fd

Jérémie Galarneau [Fri, 15 Apr 2022 03:26:12 +0000 (23:26 -0400)] 
Tests: fix: leak of trigger in trigger listing tests

==1480456==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7fdb9260cfb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7fdb9242348d in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x7fdb924295a9 in lttng_trigger* zmalloc<lttng_trigger>() ../../src/common/macros.hpp:89
    #3 0x7fdb92423dbe in lttng_trigger_create /home/jgalar/EfficiOS/src/lttng-tools/src/common/trigger.cpp:58
    #4 0x56304832331f in register_trigger /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:24
    #5 0x5630483233f1 in register_trigger_action_list_notify /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:46
    #6 0x5630483239a0 in test_session_rotation_conditions /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:246
    #7 0x563048323d4d in main /home/jgalar/EfficiOS/src/lttng-tools/tests/regression/tools/trigger/utils/register-some-triggers.cpp:309
    #8 0x7fdb91c6630f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie163989a70f65f9c2c4e93c36cc9fc6ba6bdeeb5

Jérémie Galarneau [Fri, 15 Apr 2022 03:21:27 +0000 (23:21 -0400)] 
Fix: action error query: leak of action path

==1429021==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7fe305f031b2 in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x559f1b022238 in lttng_dynamic_buffer_set_capacity(lttng_dynamic_buffer*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:159
    #2 0x559f1b021d9f in lttng_dynamic_buffer_append(lttng_dynamic_buffer*, void const*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:52
    #3 0x559f1b02144a in lttng_dynamic_array_add_element(lttng_dynamic_array*, void const*) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-array.cpp:58
    #4 0x559f1b07d07b in lttng_action_path_copy(lttng_action_path const*, lttng_action_path*) actions/path.cpp:116
    #5 0x559f1b02383f in lttng_error_query_action_create /home/jgalar/EfficiOS/src/lttng-tools/src/common/error-query.cpp:232
    #6 0x559f1b02760e in lttng_error_query_create_from_payload(lttng_payload_view*, lttng_error_query**) /home/jgalar/EfficiOS/src/lttng-tools/src/common/error-query.cpp:911
    #7 0x559f1af5c361 in receive_lttng_error_query /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:740
    #8 0x559f1af64eba in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2336
    #9 0x559f1af67378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624
    #10 0x559f1af50642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68
    #11 0x7fe3055225c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7a6f7d2a9746124581eebf30877466f16db67a6b

Jérémie Galarneau [Fri, 15 Apr 2022 00:22:03 +0000 (20:22 -0400)] 
Fix: lttng: enable-channel: leak of popt arguments

==1245463==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 5 byte(s) in 1 object(s) allocated from:
    #0 0x7fe7c494fdd9 in __interceptor_malloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x7fe7c44a5c09  (/usr/lib/libpopt.so.0+0x3c09)

Arguments obtained with poptGetOptArg() must be freed.
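One way to make this kind of leak harder to reintroduce is to hand the returned string to a `std::unique_ptr` with a `free()`-based deleter as soon as it is obtained. The sketch below uses a hypothetical `get_opt_arg_stand_in()` in place of `poptGetOptArg()` so that it builds without libpopt; `parse_size_arg` and `free_deleter` are likewise names invented for this example.

```cpp
#include <cstdlib>
#include <memory>
#include <new>
#include <stdexcept>
#include <string.h>

namespace sketch {

// Releases memory obtained from malloc()-family allocators.
struct free_deleter {
    void operator()(void *ptr) const noexcept
    {
        std::free(ptr);
    }
};

using c_string_ptr = std::unique_ptr<char, free_deleter>;

// Hypothetical stand-in for poptGetOptArg(): returns a heap-allocated copy
// that the caller owns and must free.
char *get_opt_arg_stand_in(const char *value)
{
    return ::strdup(value);
}

// Parses a numeric option argument; the argument is freed on every path,
// including the error path, when `arg` goes out of scope.
unsigned long parse_size_arg(const char *raw_value)
{
    c_string_ptr arg(get_opt_arg_stand_in(raw_value));

    if (!arg) {
        throw std::bad_alloc();
    }

    char *end = nullptr;
    const unsigned long value = std::strtoul(arg.get(), &end, 10);

    if (end == arg.get() || *end != '\0') {
        throw std::invalid_argument("Invalid size argument");
    }

    return value;
}

} // namespace sketch
```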

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5a65ca6fbaa18f7717ea918a5bc7f42daeb1009a

Jérémie Galarneau [Fri, 15 Apr 2022 00:09:58 +0000 (20:09 -0400)] 
Tests: clean-up: rate policy: remove stale comment

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idd030c765b0a4afa2d13ff015a17bd52493204a6

Jérémie Galarneau [Fri, 15 Apr 2022 00:09:24 +0000 (20:09 -0400)] 
Tests: fix: leak of rate policy in rate policy unit tests

==1198508==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x7f8b62634fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557871869adb in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x55787186c8a0 in zmalloc<(anonymous namespace)::lttng_rate_policy_once_after_n> ../../src/common/macros.hpp:89
    #3 0x55787186c173 in lttng_rate_policy_once_after_n_create actions/rate-policy.cpp:707
    #4 0x55787186a368 in lttng_rate_policy_once_after_n_create_from_payload actions/rate-policy.cpp:183
    #5 0x55787186ad02 in lttng_rate_policy_create_from_payload(lttng_payload_view*, lttng_rate_policy**) actions/rate-policy.cpp:287
    #6 0x557871865b5b in test_rate_policy_once_after_n /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:231
    #7 0x557871865dc9 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:250
    #8 0x7f8b61c7130f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x7f8b62634fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557871869adb in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x55787186c890 in zmalloc<(anonymous namespace)::lttng_rate_policy_every_n> ../../src/common/macros.hpp:89
    #3 0x55787186b6cd in lttng_rate_policy_every_n_create actions/rate-policy.cpp:492
    #4 0x55787186a699 in lttng_rate_policy_every_n_create_from_payload actions/rate-policy.cpp:220
    #5 0x55787186ad02 in lttng_rate_policy_create_from_payload(lttng_payload_view*, lttng_rate_policy**) actions/rate-policy.cpp:287
    #6 0x557871864cae in test_rate_policy_every_n /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:122
    #7 0x557871865dc4 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_rate_policy.cpp:249
    #8 0x7f8b61c7130f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

SUMMARY: AddressSanitizer: 112 byte(s) leaked in 2 allocation(s).

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3a9b4d99e93f355ddb8623a289f8397907486ab0

Jérémie Galarneau [Fri, 15 Apr 2022 00:06:19 +0000 (20:06 -0400)] 
Tests: fix: leak of payload in serdes test of log level rule

==1190137==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7f40a9d4c1b2 in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x55ab716e1def in lttng_dynamic_buffer_set_capacity(lttng_dynamic_buffer*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:159
    #2 0x55ab716e1956 in lttng_dynamic_buffer_append(lttng_dynamic_buffer*, void const*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:52
    #3 0x55ab716ca64e in lttng_log_level_rule_serialize(lttng_log_level_rule const*, lttng_payload*) /home/jgalar/EfficiOS/src/lttng-tools/src/common/log-level-rule.cpp:177
    #4 0x55ab716c760f in test_log_level_rule_serialize_deserialize /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_log_level_rule.cpp:60
    #5 0x55ab716c8457 in test_log_level_rule_at_least_as_severe_as /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_log_level_rule.cpp:177
    #6 0x55ab716c84d3 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_log_level_rule.cpp:185
    #7 0x7f40a938830f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7f40a9d4c1b2 in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x55ab716e1def in lttng_dynamic_buffer_set_capacity(lttng_dynamic_buffer*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:159
    #2 0x55ab716e1956 in lttng_dynamic_buffer_append(lttng_dynamic_buffer*, void const*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:52
    #3 0x55ab716ca64e in lttng_log_level_rule_serialize(lttng_log_level_rule const*, lttng_payload*) /home/jgalar/EfficiOS/src/lttng-tools/src/common/log-level-rule.cpp:177
    #4 0x55ab716c760f in test_log_level_rule_serialize_deserialize /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_log_level_rule.cpp:60
    #5 0x55ab716c8135 in test_log_level_rule_exactly /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_log_level_rule.cpp:154
    #6 0x55ab716c84ce in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_log_level_rule.cpp:184
    #7 0x7f40a938830f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2d1eafabbd5c101c188bad8a2137615b29c0ef68

Jérémie Galarneau [Fri, 15 Apr 2022 00:02:18 +0000 (20:02 -0400)] 
Tests: fix: leak of some attributes of ltt_ust_session

==1175545==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8696 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707ddc6004 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x55707ddceb17 in ltt_ust_session* zmalloc<ltt_ust_session>() ../../../src/common/macros.hpp:89
    #3 0x55707ddc81e7 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:274
    #4 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #5 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #6 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 24672 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707dee4ec1 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x55707def774e in consumer_output* zmalloc<consumer_output>() ../../../src/common/macros.hpp:89
    #3 0x55707dee90df in consumer_create_output(consumer_dst_type) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/consumer.cpp:523
    #4 0x55707ddc8821 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:321
    #5 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #6 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #7 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 1024 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bf985f in alloc_split_items_count /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash.c:688
    #2 0x7efed0bf985f in _cds_lfht_new /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash.c:1642

Indirect leak of 656 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfac68 in __default_alloc_cds_lfht ../src/rculfhash-internal.h:172
    #2 0x7efed0bfac68 in alloc_cds_lfht /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:81

Indirect leak of 48 byte(s) in 2 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfabd4 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:35
    #2 0x7efed0bfabd4 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:28

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55707de3a9af in zmalloc_internal ../../src/common/macros.hpp:60
    #2 0x55707de3a9bf in lttng_ht* zmalloc<lttng_ht>() ../../src/common/macros.hpp:89
    #3 0x55707de38461 in lttng_ht_new(unsigned long, lttng_ht_type) hashtable/hashtable.cpp:113
    #4 0x55707dee9340 in consumer_create_output(consumer_dst_type) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/consumer.cpp:535
    #5 0x55707ddc8821 in trace_ust_create_session(unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/trace-ust.cpp:321
    #6 0x55707ddc2bea in test_create_one_ust_session /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:63
    #7 0x55707ddc4941 in main /home/jgalar/EfficiOS/src/lttng-tools/tests/unit/test_ust_data.cpp:283
    #8 0x7efed04f930f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

Indirect leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7efed0f39fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7efed0bfac15 in cds_lfht_alloc_bucket_table /home/jgalar/EfficiOS/src/userspace-rcu/src/rculfhash-mm-order.c:31

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ib2ad82a197f2a4ccb86ae5799c1d93ff059888e3

2 years agoFix: liblttng-ctl: leak of payload on field listing
Jérémie Galarneau [Thu, 14 Apr 2022 23:45:28 +0000 (19:45 -0400)] 
Fix: liblttng-ctl: leak of payload on field listing

LeakSanitizer reports the following leak:

==974957==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7fdb86fcd1b2 in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x7fdb86d7c296 in lttng_dynamic_buffer_set_capacity(lttng_dynamic_buffer*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:159
    #2 0x7fdb86d7c060 in lttng_dynamic_buffer_set_size(lttng_dynamic_buffer*, unsigned long) /home/jgalar/EfficiOS/src/lttng-tools/src/common/dynamic-buffer.cpp:112
    #3 0x7fdb86d2589a in recv_payload_sessiond /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.cpp:230
    #4 0x7fdb86d26fa5 in lttng_ctl_ask_sessiond_payload(lttng_payload_view*, lttng_payload*) /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.cpp:662
    #5 0x7fdb86d2cd8d in lttng_list_tracepoint_fields /home/jgalar/EfficiOS/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.cpp:1767
    #6 0x56481623cb4c in list_ust_event_fields commands/list.cpp:850
    #7 0x5648162448d9 in cmd_list(int, char const**) commands/list.cpp:2394
    #8 0x56481628fb3e in handle_command /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:238
    #9 0x564816290601 in parse_args /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:427
    #10 0x564816290908 in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng/lttng.cpp:476
    #11 0x7fdb8661730f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

SUMMARY: AddressSanitizer: 32 byte(s) leaked in 1 allocation(s).

The session daemon's reply is indeed never released in
lttng_list_tracepoint_fields.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idd244b52a69f3b74e5c131c1c36c6ee6d76f4285

2 years agoFix: sessiond: ODR violation results in memory corruption
Jérémie Galarneau [Thu, 14 Apr 2022 23:01:25 +0000 (19:01 -0400)] 
Fix: sessiond: ODR violation results in memory corruption

Issue observed
==============

Address sanitizer reports the following invalid accesses while running
the test_mi test.

❯ ASAN_OPTIONS=detect_odr_violation=0 lttng-sessiond
=================================================================
==289173==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60400000e280 at pc 0x55cbbe35e2e0 bp 0x7f01672f1550 sp 0x7f01672f1540
WRITE of size 4 at 0x60400000e280 thread T13
    #0 0x55cbbe35e2df in mark_thread_as_ready /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:32
    #1 0x55cbbe360160 in thread_consumer_management /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:267
    #2 0x55cbbe336ac4 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:66
    #3 0x7f01729c15c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)
    #4 0x7f0172a46583 in __clone (/usr/lib/libc.so.6+0x112583)

0x60400000e280 is located 8 bytes to the right of 40-byte region [0x60400000e250,0x60400000e278)
allocated by thread T7 here:
    #0 0x7f01733b1fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55cbbe33adf3 in zmalloc_internal ../../../src/common/macros.hpp:60
    #2 0x55cbbe33ae03 in thread_notifiers* zmalloc<thread_notifiers>() ../../../src/common/macros.hpp:89
    #3 0x55cbbe3617f9 in launch_consumer_management_thread(consumer_data*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:440
    #4 0x55cbbe33cf49 in spawn_consumer_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:188
    #5 0x55cbbe33f7cf in start_consumerd /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:394
    #6 0x55cbbe345713 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:1277
    #7 0x55cbbe34d74b in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2622
    #8 0x55cbbe336ac4 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:66
    #9 0x7f01729c15c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Thread T13 created by T7 here:
    #0 0x7f0173353eb7 in __interceptor_pthread_create /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x55cbbe336f9e in lttng_thread_create(char const*, void* (*)(void*), bool (*)(void*), void (*)(void*), void*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:106
    #2 0x55cbbe3618cc in launch_consumer_management_thread(consumer_data*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:453
    #3 0x55cbbe33cf49 in spawn_consumer_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:188
    #4 0x55cbbe33f7cf in start_consumerd /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:394
    #5 0x55cbbe345713 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:1277
    #6 0x55cbbe34d74b in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2622
    #7 0x55cbbe336ac4 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:66
    #8 0x7f01729c15c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1)

Thread T7 created by T0 here:
    #0 0x7f0173353eb7 in __interceptor_pthread_create /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:216
    #1 0x55cbbe336f9e in lttng_thread_create(char const*, void* (*)(void*), bool (*)(void*), void (*)(void*), void*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:106
    #2 0x55cbbe34eebf in launch_client_thread() /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2756
    #3 0x55cbbe27f31a in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/main.cpp:1838
    #4 0x7f017296130f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f)

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:32 in mark_thread_as_ready
Shadow bytes around the buggy address:
  0x0c087fff9c00: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c087fff9c10: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c087fff9c20: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c087fff9c30: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c087fff9c40: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 00 fa
=>0x0c087fff9c50:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff9c60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff9c70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff9c80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff9c90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c087fff9ca0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==289173==ABORTING

Cause
=====

The start functions of the various worker threads of the session daemon
are implemented in separate translation units (TU). To make use of the
lttng_thread API, they all define different control structures to
control their shutdown.

Those structures are all named 'thread_notifiers' and are all allocated
using zmalloc<>. The various instances of zmalloc<thread_notifiers> all
end up having the same mangled name (e.g.
_Z7zmallocI16thread_notifiersEPT_v).

At link time, only one instance of zmalloc<thread_notifiers> is kept.
Since those structures all have different layouts and sizes, this is
problematic. However, discarding the duplicate instances is an
acceptable behaviour according to the ODR [1].

I first considered making the various memory allocation functions in
macros.hpp 'static' which results in each TU holding the appropriate
specialization of the various functions. While this works, it doesn't
make us ODR-compliant. To make a long story short, a program defining
multiple types sharing the same name, in the same namespace, is
ill-formed.

Another concern is that marking all templated free-functions as static
will eventually result in code bloat.

Solution
========

All structures defined in TUs (but not in a header) are placed in
unnamed namespaces (also called anonymous namespaces) [2].

This results in separate copies of the templated functions being
generated when specialized using a structure in an anonymous
namespace (e.g. _Z7zmallocIN12_GLOBAL__N_116thread_notifiersEEPT_v).

We could have renamed the various `thread_notifiers` structures to give
them different names. However, I found those are not the only structures
sharing a name in different TUs. For instance, the same problem applies
to `struct lttng_index` (index in a stream, index in a map).

I propose we systematically namespace structures defined in TUs in the
future.

This will also save us trouble if those POD structures eventually become
non-POD: we would experience the same "clashes" if those structures had
constructors, for example.
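
As an illustration only, a minimal sketch of the approach follows. The
zmalloc<T>() shown here is a simplified stand-in for the helper in
macros.hpp, and the structure's fields and the notifiers_start_zeroed()
function are hypothetical:

```cpp
#include <cassert>
#include <cstdlib>

/* Simplified stand-in for the zmalloc<T>() helper of macros.hpp. */
template <typename T>
T *zmalloc()
{
    return static_cast<T *>(calloc(1, sizeof(T)));
}

namespace {
/*
 * Hypothetical TU-local control structure. Being in an unnamed namespace,
 * it is distinct from any 'thread_notifiers' declared in another TU, and
 * so is the zmalloc<thread_notifiers> instantiated here.
 */
struct thread_notifiers {
    int ready;
    int quit_pipe_fd;
};
} /* namespace */

bool notifiers_start_zeroed()
{
    thread_notifiers *notifiers = zmalloc<thread_notifiers>();
    assert(notifiers);

    const bool zeroed = notifiers->ready == 0 && notifiers->quit_pipe_fd == 0;
    free(notifiers);
    return zeroed;
}
```

Each TU that declares its own `thread_notifiers` this way gets its own
instantiation, sized for its own layout.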

References
==========

[1] https://en.cppreference.com/w/cpp/language/definition
[2] https://en.cppreference.com/w/cpp/language/namespace

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I867e5a287ad8cf3ada617335bc1a80b800bf0833

2 years agoFix: liblttng-ctl: non-packed structure used for tracker serialization
Jérémie Galarneau [Thu, 14 Apr 2022 21:36:54 +0000 (17:36 -0400)] 
Fix: liblttng-ctl: non-packed structure used for tracker serialization

Using unpacked structures in liblttng-ctl's protocol can cause issues
when mixing sessiond and client of different bitness. In this specific
case I doubt it causes a problem, but it could rightfully do so on some
architectures.
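
To illustrate why packing matters, here is a minimal sketch assuming a
GCC/Clang-compatible compiler. The structures and field names are
hypothetical and are not the ones used by liblttng-ctl's protocol:

```cpp
#include <cstddef>
#include <cstdint>

/* Hypothetical wire structures; not the actual liblttng-ctl layout. */
struct unpacked_msg {
    uint8_t type;
    uint64_t value;
};

struct packed_msg {
    uint8_t type;
    uint64_t value;
} __attribute__((packed));

/* Packed: the layout is the same regardless of the ABI's alignment rules. */
static_assert(sizeof(packed_msg) == 9, "1 + 8 bytes, no padding");
static_assert(offsetof(packed_msg, value) == 1, "value directly follows type");

/*
 * Unpacked: the compiler pads after 'type' to align 'value'. The amount
 * of padding depends on the ABI (e.g. 7 bytes on x86-64, 3 on i386).
 */
static_assert(sizeof(unpacked_msg) > sizeof(packed_msg), "padding was inserted");
```

A 32-bit client and a 64-bit sessiond exchanging `unpacked_msg` would
disagree on the offset of `value`; with `packed_msg` they cannot.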

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie17096a55a4b7508c604e184cae877b83df6e451

2 years agoFix: sessiond: assert on empty payload when handling client out event
Jérémie Galarneau [Fri, 8 Apr 2022 23:34:04 +0000 (19:34 -0400)] 
Fix: sessiond: assert on empty payload when handling client out event

Observed issue
==============

When servicing a large number of tracer notifications and sending
notifications to clients, the session daemon occasionally hits
an assertion:

  #4  0x00007fb224d7d116 in __assert_fail () from /usr/lib/libc.so.6
  #5  0x000056038b2fe4d7 in client_flush_outgoing_queue (client=0x7fb21400c3b0) at notification-thread-events.cpp:3586
  #6  0x000056038b2ff819 in handle_notification_thread_client_out (state=0x7fb221974090, socket=77) at notification-thread-events.cpp:4104
  #7  0x000056038b2f3d77 in thread_notification (data=0x56038cc7fe90) at notification-thread.cpp:763
  #8  0x000056038b30ca7d in launch_thread (data=0x56038cc7e220) at thread.cpp:66
  #9  0x00007fb224dcf5c2 in start_thread () from /usr/lib/libc.so.6
  #10 0x00007fb224e54584 in clone () from /usr/lib/libc.so.6

Cause
=====

A client "out" event can be received when no payload is left
to send under some circumstances.

Many threads can flush a client's outgoing queue and, if they
had to queue their message (socket was full), will use the
"communication update" command to signal the (e)poll thread
to monitor for space being made available in the socket.

Commands are sent over an internal pipe serviced by the same
thread as the client sockets.

When space is made available in the socket, there is a race
between the (e)poll thread and the other threads that may
wish to use the client's socket to flush its outgoing queue.

A non-(e)poll thread may attempt to flush the queue (and succeed)
before the (e)poll thread gets a chance to service
the client's "out" event.

In this situation, the (e)poll thread processing the client
out event will see an empty payload: there is nothing to do.

Solution
========

The (e)poll thread can simply ignore the "client out" event
when an empty payload is seen.

There is also no need to update the transmission status as
the other thread has already enqueued a "communication
update" command to do so.
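
The behaviour described above could be sketched as follows. The types
and the handle_client_out() function are hypothetical stand-ins and do
not reflect the session daemon's actual code:

```cpp
#include <cassert>
#include <vector>

/* Hypothetical stand-in for a client's outgoing queue. */
struct client_queue {
    std::vector<char> pending;
};

enum class out_event_result { NOTHING_TO_DO, FLUSHED };

out_event_result handle_client_out(client_queue &queue)
{
    if (queue.pending.empty()) {
        /*
         * Another thread flushed the queue first. It has also enqueued
         * the "communication update" command, so nothing is left to do.
         */
        return out_event_result::NOTHING_TO_DO;
    }

    /* A real handler would write to the socket; the sketch drops the bytes. */
    queue.pending.clear();
    return out_event_result::FLUSHED;
}
```

The empty-queue case returns early instead of asserting, which is the
whole point of the change.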

Known drawbacks
===============

None.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8a181bea1e37e8e14cc67b624b76d139b488eded

2 years agoBump minimum kernel version to 2.6.30 to use EFD_SEMAPHORE
Jonathan Rajotte [Wed, 6 Apr 2022 19:32:44 +0000 (15:32 -0400)] 
Bump minimum kernel version to 2.6.30 to use EFD_SEMAPHORE

The bump in the kernel version allows the use of EFD_SEMAPHORE for
eventfd.
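
For reference, a minimal sketch of the semantics EFD_SEMAPHORE provides
(Linux only). The drain_semaphore_eventfd() helper is hypothetical and
only serves to demonstrate the behaviour:

```cpp
#include <cassert>
#include <cstdint>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Post 'count' units to a semaphore-mode eventfd, then return the number
 * of reads needed to drain it. Returns 0 on error.
 */
unsigned int drain_semaphore_eventfd(uint64_t count)
{
    const int fd = eventfd(0, EFD_SEMAPHORE | EFD_NONBLOCK);
    if (fd < 0) {
        return 0;
    }

    unsigned int reads = 0;
    if (write(fd, &count, sizeof(count)) == static_cast<ssize_t>(sizeof(count))) {
        uint64_t value;

        /* The loop ends when the counter is 0: read() then fails with EAGAIN. */
        while (read(fd, &value, sizeof(value)) == static_cast<ssize_t>(sizeof(value))) {
            /* In semaphore mode, each read returns 1 and decrements by 1. */
            assert(value == 1);
            reads++;
        }
    }

    close(fd);
    return reads;
}
```

Posting 3 units therefore takes 3 reads to drain, whereas a plain
eventfd would return 3 on the first read and reset the counter.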

Adjust the README.md to reflect this. No need to provide direct
instructions for older kernels. We leave the '--disable-epoll' switch
available, and the code behind it, simply because other platforms might
not have epoll available.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id512f018c5394c9cc699e19c3d5a0d753e56414b

2 years agoFix: Revert of 814b4934e2604a419bcb8eec57c0450dbb47e2c3
Jonathan Rajotte [Wed, 6 Apr 2022 13:17:38 +0000 (09:17 -0400)] 
Fix: Revert of 814b4934e2604a419bcb8eec57c0450dbb47e2c3

Observed issue
==============

During high throughput event notification generation scenarios the
following deadlock happens:

 Thread 14 (Thread 0x7f74b4ff9700 (LWP 76062)):
 #0  __lll_lock_wait (futex=futex@entry=0x56408765dde8, private=0) at lowlevellock.c:52
 #1  0x00007f74c941a0a3 in __GI___pthread_mutex_lock (mutex=0x56408765dde8) at ../nptl/pthread_mutex_lock.c:80
 #2  0x000056408704b207 in run_command_wait (handle=0x56408765ddd0, cmd=0x7f74b4ff7f70) at notification-thread-commands.cpp:31
 #3  0x000056408704bcef in notification_thread_command_remove_tracer_event_source (handle=0x56408765ddd0, tracer_event_source_fd=54) at notification-thread-commands.cpp:319
 #4  0x000056408708a0c1 in delete_ust_app (app=0x7f749c000bf0) at ust-app.cpp:1059
 #5  0x000056408708a511 in delete_ust_app_rcu (head=0x7f749c000ca0) at ust-app.cpp:1122
 #6  0x00007f74c988b4a7 in call_rcu_thread (arg=0x7f74b8004a80) at ../src/urcu-call-rcu-impl.h:369
 #7  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #8  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 13 (Thread 0x7f74b57fa700 (LWP 76047)):
 #0  0x00007f74c933a49e in epoll_wait (epfd=48, events=0x7f74a4000b60, maxevents=2, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
 #1  0x00005640870eafa6 in compat_epoll_wait (events=0x7f74b57f9240, timeout=-1, interruptible=false) at compat/poll.cpp:280
 #2  0x00005640870abb65 in thread_agent_management (data=0x56408765f0b0) at agent-thread.cpp:424
 #3  0x0000564087062b1a in launch_thread (data=0x56408765f150) at thread.cpp:66
 #4  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #5  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 12 (Thread 0x7f74b5ffb700 (LWP 76046)):
 #0  0x00007f74c933a49e in epoll_wait (epfd=47, events=0x7f74a0000b60, maxevents=2, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
 #1  0x00005640870eafa6 in compat_epoll_wait (events=0x7f74b5ffa170, timeout=-1, interruptible=false) at compat/poll.cpp:280
 #2  0x00005640870a4095 in thread_application_notification (data=0x56408765ee40) at notify-apps.cpp:78
 #3  0x0000564087062b1a in launch_thread (data=0x56408765eed0) at thread.cpp:66
 #4  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #5  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 11 (Thread 0x7f74b67fc700 (LWP 76045)):
 #0  0x00007f74c933a49e in epoll_wait (epfd=44, events=0x7f74ac000b60, maxevents=2, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
 #1  0x00005640870eafa6 in compat_epoll_wait (events=0x7f74b67fb170, timeout=-1, interruptible=false) at compat/poll.cpp:280
 #2  0x00005640870723db in thread_application_management (data=0x56408765ebd0) at manage-apps.cpp:93
 #3  0x0000564087062b1a in launch_thread (data=0x56408765ec60) at thread.cpp:66
 #4  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #5  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 10 (Thread 0x7f74b6ffd700 (LWP 76044)):
 #0  0x00007f74c933a49e in epoll_wait (epfd=39, events=0x7f74a8000b60, maxevents=2, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
 #1  0x00005640870eafa6 in compat_epoll_wait (events=0x7f74b6ffc130, timeout=-1, interruptible=false) at compat/poll.cpp:280
 #2  0x0000564087070a27 in thread_application_registration (data=0x56408765e940) at register.cpp:214
 #3  0x0000564087062b1a in launch_thread (data=0x56408765e9f0) at thread.cpp:66
 #4  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #5  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 9 (Thread 0x7f74b77fe700 (LWP 76043)):
 #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
 #1  0x00005640870c8e25 in futex (uaddr=0x5640871e2800 <ust_cmd_queue>, op=0, val=-1, timeout=0x0, uaddr2=0x0, val3=0) at /home/joraj/lttng/master/install/include/urcu/futex.h:72
 #2  0x00005640870c8e6d in futex_async (uaddr=0x5640871e2800 <ust_cmd_queue>, op=0, val=-1, timeout=0x0, uaddr2=0x0, val3=0) at /home/joraj/lttng/master/install/include/urcu/futex.h:104
 #3  0x00005640870c939a in futex_nto1_wait (futex=0x5640871e2800 <ust_cmd_queue>) at futex.cpp:77
 #4  0x000056408706f2af in thread_dispatch_ust_registration (data=0x56408765e740) at dispatch.cpp:453
 #5  0x0000564087062b1a in launch_thread (data=0x56408765e760) at thread.cpp:66
 #6  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #7  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 8 (Thread 0x7f74b7fff700 (LWP 76042)):
 #0  0x00007f74c933a49e in epoll_wait (epfd=33, events=0x7f74b0000b60, maxevents=2, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
 #1  0x00005640870eafa6 in compat_epoll_wait (events=0x7f74b7ffad40, timeout=-1, interruptible=false) at compat/poll.cpp:280
 #2  0x000056408706c424 in thread_manage_clients (data=0x56408765e4f0) at client.cpp:2528
 #3  0x0000564087062b1a in launch_thread (data=0x56408765e560) at thread.cpp:66
 #4  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #5  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 7 (Thread 0x7f74c4b8f700 (LWP 76041)):
 #0  0x00007f74c933a49e in epoll_wait (epfd=31, events=0x7f74bc000b60, maxevents=3, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
 #1  0x00005640870eafa6 in compat_epoll_wait (events=0x7f74c4b8e240, timeout=-1, interruptible=false) at compat/poll.cpp:280
 #2  0x000056408705f2b6 in thread_rotation (data=0x56408765e280) at rotation-thread.cpp:804
 #3  0x0000564087062b1a in launch_thread (data=0x56408765e310) at thread.cpp:66
 #4  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #5  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 6 (Thread 0x7f74c5390700 (LWP 76040)):
 #0  0x00007f74c925f1d2 in __GI___sigtimedwait (set=0x7f74c538f090, info=0x7f74c538f110, timeout=0x0) at ../sysdeps/unix/sysv/linux/sigtimedwait.c:29
 #1  0x000056408706138a in thread_timer (data=0x7ffc1fcbe3f0) at timer.cpp:359
 #2  0x0000564087062b1a in launch_thread (data=0x56408765e0a0) at thread.cpp:66
 #3  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #4  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 5 (Thread 0x7f74c5b91700 (LWP 76039)):
 #0  __libc_write (nbytes=8, buf=0x7f74c5b8fc88, fd=24) at ../sysdeps/unix/sysv/linux/write.c:26
 #1  __libc_write (fd=24, buf=0x7f74c5b8fc88, nbytes=8) at ../sysdeps/unix/sysv/linux/write.c:24
 #2  0x00005640870eeb4f in lttng_write (fd=24, buf=0x7f74c5b8fc88, count=8) at readwrite.cpp:77
 #3  0x000056408704b535 in run_command_no_wait (handle=0x56408765ddd0, in_cmd=0x7f74c5b8fdf0) at notification-thread-commands.cpp:92
 #4  0x000056408704bf49 in notification_thread_client_communication_update (handle=0x56408765ddd0, id=2, transmission_status=CLIENT_TRANSMISSION_STATUS_QUEUED) at notification-thread-command
 #5  0x000056408707bc62 in client_handle_transmission_status (client=0x7f74b80050d0, status=CLIENT_TRANSMISSION_STATUS_QUEUED, user_data=0x7f74b8004410) at action-executor.cpp:258
 #6  0x0000564087057525 in notification_client_list_send_evaluation (client_list=0x7f74b8004df0, trigger=0x7f74b0001030, evaluation=0x7f74b815d1d0, source_object_creds=0x0, client_report=0x5
 #7  0x000056408707bce9 in action_executor_notify_handler (executor=0x7f74b8004410, work_item=0x7f74b815d430, item=0x7f74b80e48e0) at action-executor.cpp:269
 #8  0x000056408707dd6d in action_executor_generic_handler (executor=0x7f74b8004410, work_item=0x7f74b815d430, item=0x7f74b80e48e0) at action-executor.cpp:670
 #9  0x000056408707df01 in action_work_item_execute (executor=0x7f74b8004410, work_item=0x7f74b815d430) at action-executor.cpp:689
 #10 0x000056408707e525 in action_executor_thread (_data=0x7f74b8004410) at action-executor.cpp:771
 #11 0x0000564087062b1a in launch_thread (data=0x7f74b80044b0) at thread.cpp:66
 #12 0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #13 0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 4 (Thread 0x7f74c6392700 (LWP 76038)):
 #0  __lll_lock_wait (futex=futex@entry=0x56408765dde8, private=0) at lowlevellock.c:52
 #1  0x00007f74c941a0a3 in __GI___pthread_mutex_lock (mutex=0x56408765dde8) at ../nptl/pthread_mutex_lock.c:80
 #2  0x0000564087053c89 in handle_notification_thread_command (handle=0x56408765ddd0, state=0x7f74c63911b0) at notification-thread-events.cpp:3142
 #3  0x000056408704ac81 in thread_notification (data=0x56408765ddd0) at notification-thread.cpp:715
 #4  0x0000564087062b1a in launch_thread (data=0x56408765dec0) at thread.cpp:66
 #5  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #6  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 3 (Thread 0x7f74c6b93700 (LWP 76037)):
 #0  0x00007f74c933a49e in epoll_wait (epfd=21, events=0x7f74c0000b60, maxevents=2, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
 #1  0x00005640870eafa6 in compat_epoll_wait (events=0x7f74c6b92170, timeout=-1, interruptible=false) at compat/poll.cpp:280
 #2  0x000056408706400a in thread_manage_health (data=0x56408765db50) at health.cpp:140
 #3  0x0000564087062b1a in launch_thread (data=0x56408765dbf0) at thread.cpp:66
 #4  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #5  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 2 (Thread 0x7f74c7394700 (LWP 76036)):
 #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
 #1  0x00007f74c987d238 in futex (uaddr=0x564087659b10, op=0, val=-1, timeout=0x0, uaddr2=0x0, val3=0) at ../include/urcu/futex.h:72
 #2  futex_async (uaddr=0x564087659b10, op=0, val=-1, timeout=0x0, uaddr2=0x0, val3=0) at ../include/urcu/futex.h:104
 #3  futex_wait (futex=0x564087659b10) at workqueue.c:136
 #4  0x00007f74c987ced2 in workqueue_thread (arg=0x564087659ad0) at workqueue.c:237
 #5  0x00007f74c9417609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #6  0x00007f74c933a163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

 Thread 1 (Thread 0x7f74c73cd300 (LWP 76034)):
 #0  0x00007f74c933a49e in epoll_wait (epfd=50, events=0x564087666880, maxevents=1, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
 #1  0x00005640870eafa6 in compat_epoll_wait (events=0x7ffc1fcbe280, timeout=-1, interruptible=false) at compat/poll.cpp:280
 #2  0x0000564087062244 in sessiond_wait_for_quit_pipe (timeout_ms=-1) at thread-utils.cpp:83
 #3  0x00005640870127dc in main (argc=1, argv=0x7ffc1fcbe668) at main.cpp:1921

Cause
=====

The event_pipe used to notify the notification poll loop is full and the
lttng_write call blocks with the locks for both the client and the
cmd_queue held.

Solution
========

Go back to using eventfd but without the use of EFD_SEMAPHORE (Linux
2.6.30) to continue supporting kernels between 2.6.27 and 2.6.29.

The EFD_SEMAPHORE is emulated with a read, decrement, write as explained
by the initial committer of EFD_SEMAPHORE [1].
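
A minimal sketch of such an emulation follows (Linux only). It is not
the code of this commit: the function names are hypothetical, and it
assumes a single consumer since the read, decrement, write sequence is
not atomic:

```cpp
#include <cassert>
#include <cstdint>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Take one unit from a plain (non-semaphore) eventfd. Returns true if a
 * unit was taken, false if the counter was zero or on error.
 */
bool eventfd_take_one(int fd)
{
    uint64_t value;

    /* A plain eventfd read returns the whole counter and resets it to 0. */
    if (read(fd, &value, sizeof(value)) != static_cast<ssize_t>(sizeof(value))) {
        return false;
    }

    /* Write back everything but the unit being consumed. */
    value--;
    if (value > 0 &&
            write(fd, &value, sizeof(value)) != static_cast<ssize_t>(sizeof(value))) {
        return false;
    }

    return true;
}

/* Post 'posted' units, then count how many can be taken one at a time. */
unsigned int count_units(uint64_t posted)
{
    const int fd = eventfd(0, EFD_NONBLOCK);
    if (fd < 0) {
        return 0;
    }

    unsigned int taken = 0;
    if (write(fd, &posted, sizeof(posted)) == static_cast<ssize_t>(sizeof(posted))) {
        while (eventfd_take_one(fd)) {
            taken++;
        }
    }

    close(fd);
    return taken;
}
```

From the consumer's point of view, each call takes exactly one unit,
which matches what EFD_SEMAPHORE would provide.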

Known drawbacks
===============

This does not solve the actual block+lock problem but simply pushes it
back further. The lttng_write on the eventfd can block when reaching
UINT64_MAX. This would represent, at 1 command queued per ns (which is
ridiculous), ~584 years of queueing without a dequeue operation.
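
The figure can be checked at build time with a short sketch. The
constant names are illustrative, and a year is taken as 365.25 days:

```cpp
#include <cstdint>

/* Nanoseconds in a year of 365.25 days. */
constexpr double ns_per_year = 365.25 * 24 * 60 * 60 * 1e9;

/* Time needed to reach UINT64_MAX at one command queued per nanosecond. */
constexpr double years_to_overflow = static_cast<double>(UINT64_MAX) / ns_per_year;

static_assert(years_to_overflow > 584.0 && years_to_overflow < 585.0,
        "roughly 584 years, as stated above");
```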

Reference
=========
[1] https://lwn.net/Articles/318151/

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie749c4169708f57463fe3cfab2366f1015bae4e0

2 years agoBuild fix: missing type traits on gcc < 5.0
Jérémie Galarneau [Fri, 8 Apr 2022 19:09:16 +0000 (15:09 -0400)] 
Build fix: missing type traits on gcc < 5.0

gcc versions before 5.0 lack some type traits defined in C++11. In this
instance, we use the trait in static assertions to catch misuses of
certain functions at build time, not to generate different code based
on this property. It is therefore preferable to simply set value to
true and allow the code to compile. Anyone using a contemporary
compiler will catch the error.

I have not replaced the type trait checks with macros using gcc-specific
checks (__has_trivial_copy(), for example) since their semantics diverge
subtly from the standard and their use could introduce bugs.
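
A sketch of what such a fallback could look like follows. The `sketch`
namespace, the choice of trait, and the version check are assumptions
made for illustration, not the actual change:

```cpp
#include <type_traits>

namespace sketch {
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ < 5)
/* Trait unavailable: always report true so that the code compiles. */
template <typename T>
struct is_trivially_copy_constructible : std::true_type {
};
#else
/* Contemporary compiler: defer to the standard trait. */
template <typename T>
struct is_trivially_copy_constructible : std::is_trivially_copy_constructible<T> {
};
#endif
} /* namespace sketch */

/* Hypothetical type used with a function that requires trivial copies. */
struct plain_record {
    int id;
    long value;
};

static_assert(sketch::is_trivially_copy_constructible<plain_record>::value,
        "plain_record must be trivially copy constructible");
```

On an old gcc the assertion can never fire, so misuses go unnoticed
there; they are still caught by any build using a recent compiler.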

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id57cc1cff67847c725f75eb3404443732de1c531

2 years agoBuild fix: poll compatibility mode: zmalloc prototype changed
Jérémie Galarneau [Fri, 8 Apr 2022 18:22:11 +0000 (14:22 -0400)] 
Build fix: poll compatibility mode: zmalloc prototype changed

The build fails on platforms that don't support the epoll system
call (or when building with --disable-epoll on Linux):

  compat/poll.cpp:458:35: error: no matching function for call to 'zmalloc'
          wait->events = (struct pollfd *) zmalloc(size * sizeof(struct pollfd));
                                           ^~~~~~~
  ./macros.hpp:85:4: note: candidate template ignored: couldn't infer template argument 'T'
  T *zmalloc(size_t size)
     ^
  ./macros.hpp:74:4: note: candidate function template not viable: requires 0 arguments, but 1 was provided
  T *zmalloc()
     ^
  compat/poll.cpp:466:38: error: no matching function for call to 'zmalloc'
          current->events = (struct pollfd *) zmalloc(size * sizeof(struct pollfd));
                                              ^~~~~~~
  ./macros.hpp:85:4: note: candidate template ignored: couldn't infer template argument 'T'
  T *zmalloc(size_t size)
     ^
  ./macros.hpp:74:4: note: candidate function template not viable: requires 0 arguments, but 1 was provided
  T *zmalloc()

Replace the uses of "old style" malloc with the new type-safe
function introduced recently.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ib1660f6a548c155f021843b7476d5d64c06c6e5a

2 years agoFix: sessiond: inverted condition checking for empty hash table
Jérémie Galarneau [Wed, 6 Apr 2022 17:37:16 +0000 (13:37 -0400)] 
Fix: sessiond: inverted condition checking for empty hash table

I inverted a condition while reformatting 2a6ebf6bd. This reverts the
condition to the intention of the original author. Mea culpa.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8c39a89f430dbb4a0f1e385b3b8e4788f188a468

2 years agoFix: notification: kernel: consumption of event notification stalls
Jonathan Rajotte [Fri, 1 Apr 2022 12:41:17 +0000 (08:41 -0400)] 
Fix: notification: kernel: consumption of event notification stalls

Observed issue
==============

Using:

 lttng add-trigger --condition event-rule-matches --type kernel:tracepoint --name "sched_waking" --capture comm --action notify

The sessiond receives multiple event notifications from the kernel event
source, then stops receiving them despite the kernel event source buffer
being full.

Cause
=====

It turns out that the kernel event source, when reaching near the end of
its buffer capacity, raises the POLLPRI [1] flag and not the POLLIN
flag.

Solution
========

lttng-modules stretches the usage of POLLPRI a bit compared to its
definition in the man page (man 2 poll):

 There is some exceptional condition on the  file  descriptor. Possibilities
 include:

 *  There is out-of-band data on a TCP socket (see tcp(7)).

 *  A  pseudoterminal  master  in  packet  mode has seen a state change on the
    slave (see ioctl_tty(2)).

 *  A cgroup.events file has been modified (see cgroups(7)).

Still, even if lttng-modules changes how it does things, lttng-tools
needs to support other lttng-modules versions.

Thus, add LPOLLPRI (EPOLLPRI/POLLPRI) to the event mask when dealing
with notification event sources.
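
A minimal sketch of the idea, using plain poll(2) rather than the
lttng-tools poll compatibility layer. The function name
example_wait_for_event_source is hypothetical and the code is only an
illustration of the event mask, not the change made by this patch.

```cpp
#include <cassert>
#include <poll.h>

/*
 * Wait for a notification event source to become readable. POLLPRI is
 * requested alongside POLLIN since, per the description above, the kernel
 * event source may raise only POLLPRI when its buffer is nearly full.
 *
 * Returns the reported events, 0 on timeout, or -1 on error.
 */
static int example_wait_for_event_source(int fd, int timeout_ms)
{
	struct pollfd pfd = {};

	pfd.fd = fd;
	pfd.events = POLLIN | POLLPRI;

	const int ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0) {
		return -1;
	}

	return ret == 0 ? 0 : pfd.revents;
}
```

A caller would treat either POLLIN or POLLPRI in the returned mask as a
reason to consume notifications from the source.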

Note
====

In the future, during the poll loop we could also prioritize
event sources in POLLPRI 'state'.

Known drawbacks
===============

None.

References
==========

[1] https://github.com/lttng/lttng-modules/blob/c312bda00d2dc10ce5f6c1189acbefee5c6c8c6c/src/lttng-abi.c#L1169

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ieb428ef1037c8eb197b489a38a1ae5216ac63d4b

2 years agoFix: notification: assert on len > 0 for dropped notification message
Jonathan Rajotte [Thu, 31 Mar 2022 13:46:15 +0000 (09:46 -0400)] 
Fix: notification: assert on len > 0 for dropped notification message

Observed issue
==============

Using the notification client from
doc/examples/trigger-condition-event-matches/notification-client.cpp, an
assert is hit when the notification subsystem is under load.

 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 #1  0x00007f69eab58859 in __GI_abort () at abort.c:79
 #2  0x00007f69eab58729 in __assert_fail_base (fmt=0x7f69eacee588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x7f69eae1d5dd "len > 0", file=0x7f69eae1d5cb "unix.cpp", line=179, function=<optimized out>) at assert.c:92
 #3  0x00007f69eab6a006 in __GI___assert_fail (assertion=0x7f69eae1d5dd "len > 0", file=0x7f69eae1d5cb "unix.cpp", line=179, function=0x7f69eae1d598 "ssize_t lttcomm_recv_unix_sock(int, void*, size_t)") at assert.c:101
 #4  0x00007f69eadd5fe6 in lttcomm_recv_unix_sock (sock=3, buf=0x55da9ecd5f89, len=0) at unix.cpp:179
 #5  0x00007f69ead7df3f in receive_message (channel=0x55da9ecd6ee0) at channel.cpp:64
 #6  0x00007f69ead7e478 in lttng_notification_channel_get_next_notification (channel=0x55da9ecd6ee0, _notification=0x7ffdefed2570) at channel.cpp:279
 #7  0x000055da9e0e742f in main (argc=2, argv=0x7ffdefed2698) at notification-client.cpp:506

 (gdb) frame
 #5  0x00007f69ead7df3f in receive_message (channel=0x55da9ecd6ee0) at channel.cpp:64
 64              ret = lttcomm_recv_unix_sock(channel->socket,

 (gdb) print msg
 $2 = {type = 5 '\005', size = 0, fds = 0, payload = 0x7ffdefed24a8 ""}

The msg type 5 is
`LTTNG_NOTIFICATION_CHANNEL_MESSAGE_TYPE_NOTIFICATION_DROPPED`

Cause
=====

The msg portion of a
`LTTNG_NOTIFICATION_CHANNEL_MESSAGE_TYPE_NOTIFICATION_DROPPED` is indeed
zero. There is no extra payload.

Solution
========

When the msg size is zero, skip the 'payload' reception phase.
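
The following sketch illustrates the two-phase reception with the
payload phase skipped for empty messages. The header layout and the
example_* names are hypothetical stand-ins, not the actual notification
channel structures, and interrupted system calls are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/socket.h>
#include <sys/types.h>
#include <vector>

/* Hypothetical wire header; the real layout is defined by lttng-tools. */
struct example_msg_header {
	std::uint8_t type;
	std::uint32_t size; /* Payload size in bytes, may be 0. */
};

/* Receive exactly `len` bytes; like the real helper, rejects len == 0. */
static bool example_recv_all(int sock, void *buf, std::size_t len)
{
	assert(len > 0);

	char *cursor = static_cast<char *>(buf);
	while (len > 0) {
		const ssize_t ret = recv(sock, cursor, len, 0);
		if (ret <= 0) {
			return false;
		}

		cursor += ret;
		len -= static_cast<std::size_t>(ret);
	}

	return true;
}

static bool example_receive_message(int sock, example_msg_header& header,
		std::vector<char>& payload)
{
	payload.clear();
	if (!example_recv_all(sock, &header, sizeof(header))) {
		return false;
	}

	if (header.size == 0) {
		/* No payload follows (e.g. a "notification dropped" message). */
		return true;
	}

	payload.resize(header.size);
	return example_recv_all(sock, payload.data(), payload.size());
}
```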

Known drawbacks
===============

None.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ibabb922d0e410c9902414a5eabbe04738861d772

2 years agoFix: example: print_notification is called on all returned statuses
Jonathan Rajotte [Thu, 31 Mar 2022 13:44:24 +0000 (09:44 -0400)] 
Fix: example: print_notification is called on all returned statuses

The notification should only be printed for
`LTTNG_NOTIFICATION_CHANNEL_STATUS_OK`.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5534406d8fbd5c0fff7013fda6335d54bef071a2

2 years agoFix: sessiond: assertion hit in ltt_sessions_ht_empty
Jonathan Rajotte [Mon, 28 Mar 2022 20:49:17 +0000 (16:49 -0400)] 
Fix: sessiond: assertion hit in ltt_sessions_ht_empty

Observed issue
==============

Scenario:

gdb lttng-sessiond
  set non-stop
  break rotation-thread.cpp:584
  ^ simulates a slow rotation thread or not scheduled thread.

lttng create test1
lttng enable-event -u -a
lttng start test1
lttng create test2
lttng enable-event -u -a
lttng start test2
lttng destroy test1
   This will hang on rotation pending checks on the CLI side.

In another shell:

lttng destroy test2
   This will hang on rotation pending checks on the CLI side.

Back to gdb
   thread 7
   continue

Results in:

 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 #1  0x00007ffff786c859 in __GI_abort () at abort.c:79
 #2  0x00007ffff786c729 in __assert_fail_base (fmt=0x7ffff7a02588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5555556bb148 "count == lttng_ht_get_count(ltt_sessions_ht_by_name)", file=0x5555556bae9f "session.cpp", line=395, function=<optimized out>) at assert.c:92
 #3  0x00007ffff787e006 in __GI___assert_fail (assertion=0x5555556bb148 "count == lttng_ht_get_count(ltt_sessions_ht_by_name)", file=0x5555556bae9f "session.cpp", line=395, function=0x5555556bb129 "int ltt_sessions_ht_empty()") at assert.c:101
 #4  0x0000555555586d59 in ltt_sessions_ht_empty () at session.cpp:395
 #5  0x0000555555586e53 in del_session_ht (ls=0x7fffdc000c30) at session.cpp:418
 #6  0x0000555555588a95 in session_release (ref=0x7fffdc001e50) at session.cpp:999
 #7  0x000055555558620f in urcu_ref_put (ref=0x7fffdc001e50, release=0x5555555886eb <session_release(urcu_ref*)>) at /home/joraj/lttng/master/install/include/urcu/ref.h:68
 #8  0x0000555555588c8f in session_put (session=0x7fffdc000c30) at session.cpp:1048
 #9  0x00005555555bf995 in handle_job_queue (handle=0x55555575d260, state=0x7fffeeffc240, queue=0x555555758960) at rotation-thread.cpp:612
 #10 0x00005555555c05da in thread_rotation (data=0x55555575d260) at rotation-thread.cpp:847
 #11 0x00005555555c3b1c in launch_thread (data=0x55555575d2f0) at thread.cpp:66
 #12 0x00007ffff7a46609 in start_thread (arg=<optimized out>) at pthread_create.c:477
 #13 0x00007ffff7969163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Other scenarios can lead to a similar backtrace when using the
`--no-wait` lttng destroy option.

Cause
=====

Since ed41e5709047ef545aa28082416e641e003b45e0 [1], the removals of a
session object from the `ltt_sessions_ht_by_id` and
`ltt_sessions_ht_by_name` hash tables are "decoupled". Removal from
`ltt_sessions_ht_by_name` is done early in `session_destroy()` while
removal from `ltt_sessions_ht_by_id` is done during `session_release` when
the last reference of a session object is released.

This can lead to imbalances between the sizes of the two hash tables
when multiple sessions are at play.

Solution
========

Rework `ltt_sessions_ht_empty()` to exit early when
`ltt_sessions_ht_by_id` is not empty. Perform a sanity check on
`ltt_sessions_ht_by_name` only when `ltt_sessions_ht_by_id` is empty.
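
A simplified sketch of the reworked check, using std::unordered_map as a
stand-in for the lttng hash tables. All example_* names are hypothetical
and this is not the sessiond implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

struct example_session {
	std::uint64_t id;
	std::string name;
};

/* Stand-ins for ltt_sessions_ht_by_id and ltt_sessions_ht_by_name. */
struct example_session_tables {
	std::unordered_map<std::uint64_t, example_session *> by_id;
	std::unordered_map<std::string, example_session *> by_name;
};

static bool example_sessions_empty(const example_session_tables& tables)
{
	if (!tables.by_id.empty()) {
		/*
		 * Some sessions are still referenced. The by_name table may
		 * already be smaller, so its size is not compared here.
		 */
		return false;
	}

	/* Entries leave by_name first, so it must be empty by now. */
	assert(tables.by_name.empty());
	return true;
}
```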

Note
====

Ideally both hash tables' lifetime would be managed separately but it
seems easier in terms of initialization to bundle them together for now
considering the limited scope of the `ltt_sessions_ht_by_name` hash
table.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I66c459f80298f929add703ac977cccd1da6dd556

2 years agoFix: tests: missing _GNU_SOURCE for F_GETPIPE_SZ
Jonathan Rajotte [Thu, 31 Mar 2022 15:20:01 +0000 (11:20 -0400)] 
Fix: tests: missing _GNU_SOURCE for F_GETPIPE_SZ

Per man 2 fcntl:

  F_GETOWN_EX,  F_SETOWN_EX,  F_SETPIPE_SZ,  F_GETPIPE_SZ,  F_GETSIG,  F_SETSIG,
  F_NOTIFY,   F_GETLEASE,   and  F_SETLEASE  are  Linux-specific.   (Define  the
  _GNU_SOURCE macro to obtain these definitions.)
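
For illustration, a minimal Linux-only sketch of querying a pipe's
capacity. The helper name example_get_pipe_size is hypothetical; the
guard around the define is there because some compilers (g++, for
instance) already define _GNU_SOURCE.

```cpp
/* Must be defined before any system header is included. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif

#include <cassert>
#include <fcntl.h>
#include <unistd.h>

/* Returns the capacity of the pipe in bytes, or -1 on error. */
static int example_get_pipe_size(int pipe_fd)
{
	return fcntl(pipe_fd, F_GETPIPE_SZ);
}
```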

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2b61dfb79ffa384dc2bab56cd3510ddc6ae21e85

2 years agoFix: compat: 'LTTNG_UST_ABI_PROCNAME_LEN' is undeclared
Jonathan Rajotte [Tue, 29 Mar 2022 20:31:44 +0000 (16:31 -0400)] 
Fix: compat: 'LTTNG_UST_ABI_PROCNAME_LEN' is undeclared

Observed issue
==============

On old systems, the `lttng_pthread_setname_np` function falls back to
using the compat prctl version. In that context,
`LTTNG_UST_ABI_PROCNAME_LEN` is indeed not declared.

Solution
========

Use `LTTNG_PTHREAD_NAMELEN`. This mimics what is done in other versions
of `lttng_pthread_setname_np`.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I73956cacd7b1e9400881b17b1cd89db2530d3a00

2 years agocommon: prevent using memset on non-POD types
Simon Marchi [Wed, 8 Sep 2021 22:00:42 +0000 (18:00 -0400)] 
common: prevent using memset on non-POD types

While converting some code to use C++ constructs, it can be easy to
forget to change some spot that uses memset to initialize or move the
object. Add a templated deleted declaration to prevent using memset on
types that aren't POD.
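
A minimal sketch of the technique, assuming a glibc-like C library where
memset is a regular function. The example_* names are hypothetical, and
the POD test is spelled with std::is_trivial and std::is_standard_layout
(an assumption made here to avoid the deprecated std::is_pod).

```cpp
#include <cassert>
#include <cstddef>
#include <string.h>
#include <type_traits>

template <typename T>
struct example_is_pod {
	static constexpr bool value =
			std::is_trivial<T>::value && std::is_standard_layout<T>::value;
};

/*
 * Deleted overload: it is only a candidate for pointers to non-POD types,
 * where it is a better match than memset(void *, int, size_t).
 */
template <typename T,
		typename = typename std::enable_if<!example_is_pod<T>::value>::type>
void *memset(T *s, int c, std::size_t n) = delete;

struct example_pod {
	int a;
	int b;
};

static void example_reset(example_pod& value)
{
	/* Resolves to the C library's memset since example_pod is POD. */
	memset(&value, 0, sizeof(value));
}
```

Calling memset on an object with a user-provided constructor would
select the deleted overload and fail to build, as in the error shown
below.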

For example, if I make ltt_ust_event non-POD, in
src/bin/lttng-sessiond/trace-ust.h, I get this error:

      CXX      save.lo
    /home/simark/src/lttng-tools/src/bin/lttng-sessiond/save.cpp: In function ‘int save_agent_events(config_writer*, agent*)’:
    /home/simark/src/lttng-tools/src/bin/lttng-sessiond/save.cpp:1246:23: error: use of deleted function ‘void* memset(T*, int, size_t) [with T = ltt_ust_event; <template-parameter-1-2> = void; size_t = long unsigned int]’
     1246 |                 memset(&fake_event, 0, sizeof(fake_event));
          |                 ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In file included from /home/simark/src/lttng-tools/src/common/defaults.h:14,
                     from /home/simark/src/lttng-tools/src/bin/lttng-sessiond/save.cpp:15:
    /home/simark/src/lttng-tools/src/common/macros.h:128:7: note: declared here
      128 | void *memset(T *s, int c, size_t n) = delete;
          |       ^~~~~~

Note: I tried applying this to memcpy as well, but Clang gave me some
troubles with its -Waddress-of-packed-member diagnostic, so I gave up.

Change-Id: Id55735db15901c6fc5d58e9b6b6b689733302398
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoTests: validate_xml: leak of xml document instance
Jérémie Galarneau [Thu, 31 Mar 2022 02:59:44 +0000 (22:59 -0400)] 
Tests: validate_xml: leak of xml document instance

`doc` is never free'd when validating an XML.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia0b541963350aeb6610382fd3226ffd37ab4847e

2 years agoAdd type-checked versions of allocation and deallocations functions
Simon Marchi [Wed, 17 Nov 2021 02:36:17 +0000 (21:36 -0500)] 
Add type-checked versions of allocation and deallocations functions

A common mistake when porting things from C to C++ is to use malloc for
types that have a non-trivial constructor (and to not call the
constructor explicitly either). For example:

    struct foo {
        std::vector<int> field;
    };

    foo *f = (foo *) zmalloc(sizeof(*f));

This allocates a `foo` without calling the constructor, leaving it in an
invalid state. The same applies when using free on something that has a
non-trivial destructor.

To avoid this, I suggest adding templated allocation functions that we
will use throughout, that verify if the given type is safe to malloc
(and generate a compilation failure if not). The existing code barely
needs changes. For example:

- (foo *) zmalloc(sizeof(*f))
+ zmalloc<foo>()

For simplicity I propose that as soon as a type is non-POD
(std::is_pod<T>::value is false), we prevent using malloc/free on it. It
would be ok in theory to allocate such a type with malloc and free with
free, but call the constructor (using placement-new) and destructor
explicitly, but I don't see why we would want to do that. It might also
be technically more correct to use a combination of
std::is_trivially_constructible and std::is_trivially_destructible
(std::is_pod being not fine-grained enough), but using std::is_pod just
keeps things simpler.

This patch introduces the following templated allocation functions:

1. zmalloc<T>()
2. zmalloc<T>(size)
3. malloc<T>()
4. malloc<T>(size)
5. calloc<T>(nmemb)

1. Allocate one T, zero-initialized
2. Allocate a buffer of size `size`, zero-initialized, this is used when
   the caller calculates the size to allocate, like when using flexible
   array members
3. Same as 1, but without the zero-initialization
4. Same as 2, but without the zero-initialization
5. Allocate an array of `nmemb` elements of type T, zero-initialized
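
A rough sketch of what three of these functions could look like. It is
placed in a hypothetical `example` namespace, and it spells the POD
check with std::is_trivial and std::is_standard_layout instead of
std::is_pod; both are assumptions made for illustration, not the code
from this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

namespace example {

template <typename T>
struct can_malloc {
	static constexpr bool value =
			std::is_trivial<T>::value && std::is_standard_layout<T>::value;
};

/* 1. Allocate one T, zero-initialized. */
template <typename T>
T *zmalloc()
{
	static_assert(can_malloc<T>::value, "type can be malloc'ed");
	return static_cast<T *>(std::calloc(1, sizeof(T)));
}

/* 2. Allocate `size` bytes, zero-initialized, returned as a T pointer. */
template <typename T>
T *zmalloc(std::size_t size)
{
	static_assert(can_malloc<T>::value, "type can be malloc'ed");
	return static_cast<T *>(std::calloc(1, size));
}

/* 5. Allocate an array of `nmemb` elements of type T, zero-initialized. */
template <typename T>
T *calloc(std::size_t nmemb)
{
	static_assert(can_malloc<T>::value, "type can be malloc'ed");
	return static_cast<T *>(std::calloc(nmemb, sizeof(T)));
}

struct point {
	int x;
	int y;
};

} /* namespace example */
```

Since T cannot be deduced from the return type, callers always name it
explicitly, e.g. `example::zmalloc<example::point>()`.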

For the de-allocation side, add a templated `free` function declaration
that uses std::enable_if (SFINAE) to declare a deleted prototype if the
type T isn't safe to free (causing a compilation error).
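
A minimal sketch of such a declaration, assuming a C library where free
is a regular function. The example_* names are hypothetical and the
safety condition (trivial and standard-layout) is an assumption standing
in for the POD check.

```cpp
#include <cassert>
#include <stdlib.h>
#include <string>
#include <type_traits>

template <typename T>
struct example_can_free {
	static constexpr bool value =
			std::is_trivial<T>::value && std::is_standard_layout<T>::value;
};

/*
 * Deleted overload: only a candidate when T is not safe to free, in which
 * case it is a better match than the C library's free(void *).
 */
template <typename T,
		typename = typename std::enable_if<!example_can_free<T>::value>::type>
void free(T *p) = delete;

struct example_plain {
	int value;
};

struct example_with_string {
	std::string name;
};

static bool example_alloc_and_free()
{
	example_plain *plain =
			static_cast<example_plain *>(malloc(sizeof(example_plain)));
	if (!plain) {
		return false;
	}

	/* Accepted: the deleted overload is not viable for example_plain. */
	free(plain);
	return true;
}
```

Passing an `example_with_string *` to free would pick the deleted
overload and be rejected at build time, while a `void *` still reaches
the regular free unchecked.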

There are a lot of places where we pass pointers to void to free. These
can't be checked, as we don't know what type of object the pointer
really points to. We could forbid that and fix all callers to pass a
typed pointer, but that seems a bit too much to chew for the moment. So
for now, simply accept that freeing pointers to void won't be checked.
It's a best effort.

As an example, if I add an explicit constructor to type ctf_trace (in
src/bin/lttng-relayd/ctf-trace.h), I get the following errors with
clang. For the allocation:

/home/simark/src/lttng-tools/src/common/macros.h:57:2: error: static_assert failed due to requirement 'std::is_pod<ctf_trace>::value' "type is POD"
static_assert (std::is_pod<T>::value, "type is POD");
^              ~~~~~~~~~~~~~~~~~~~~~
/home/simark/src/lttng-tools/src/bin/lttng-relayd/ctf-trace.cpp:84:10: note: in instantiation of function template specialization 'zmalloc<ctf_trace>' requested here
trace = zmalloc<ctf_trace>();
^

For the de-allocation:

/home/simark/src/lttng-tools/src/bin/lttng-relayd/ctf-trace.cpp:29:2: error: call to deleted function 'free'
free(trace);
^~~~
/home/simark/src/lttng-tools/src/common/macros.h:125:6: note: candidate function [with T = ctf_trace, $1 = void] has been explicitly deleted
void free(T *p) = delete;
     ^
/usr/include/stdlib.h:565:13: note: candidate function
extern void free (void *__ptr) __THROW;
    ^

Change-Id: I246a9113d08fa36b81a49137f4e80a5e808de913
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agosessiond: document RCU locking assumption during channel metadata statedump
Jérémie Galarneau [Wed, 30 Mar 2022 13:23:46 +0000 (09:23 -0400)] 
sessiond: document RCU locking assumption during channel metadata statedump

The rcu read lock must be held by the caller during a call to
ust_metadata_channel_statedump. An assertion and a comment are added.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia57a140d51470cc43cf62d36c9b4b552e1c17191

2 years agoFix: lttng-sessiond: output stream metadata before events
Simon Marchi [Thu, 6 Jan 2022 18:24:59 +0000 (13:24 -0500)] 
Fix: lttng-sessiond: output stream metadata before events

When trying the `doc/examples/demo` example from the lttng-ust
repository, the resulting trace's metadata lists some events before the
corresponding stream declaration. Here's an excerpt:

    event {
        name = "lttng_ust_statedump:end";
        id = 5;
        stream_id = 0;
        loglevel = 13;
        fields := struct {
        };
    };

    stream {
        id = 0;
        event.header := struct event_header_large;
        packet.context := struct packet_context;
    };

I don't know if this is allowed in CTF 1, but it won't be in CTF 2 (an
event record class fragment must come after its parent data stream class
fragment). In any case, I think it makes more sense to have the stream
first.

What I can see is that the ust_metadata_event_statedump function (which
emits the `event` declarations) is called for the statedump events
before the ust_metadata_channel_statedump function (which emits the
`stream` declaration) is called. A simple fix, as implemented in this
patch, is to delay emitting the event declarations until the stream
declaration has been emitted. To do so, return early in
ust_metadata_event_statedump if the `chan->metadata_dumped` flag is not
set. Then, when emitting the stream declaration, in
ust_metadata_channel_statedump, emit any existing events, which have
presumably been skipped beforehand.
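
The ordering logic can be sketched as follows. The example_* types are
simplified, hypothetical stand-ins for the registry structures, and the
per-event `metadata_dumped` flag is an assumption used here to avoid
emitting an event twice.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct example_event {
	std::string name;
	bool metadata_dumped;
};

struct example_channel {
	bool metadata_dumped;
	std::vector<example_event> events;
};

/* Emits the `event` declaration, unless the stream is not declared yet. */
static void example_event_statedump(const example_channel& chan,
		example_event& event, std::string& metadata)
{
	if (!chan.metadata_dumped || event.metadata_dumped) {
		return;
	}

	metadata += "event { name = \"" + event.name + "\"; };\n";
	event.metadata_dumped = true;
}

/* Emits the `stream` declaration, then any event skipped earlier. */
static void example_channel_statedump(example_channel& chan, std::string& metadata)
{
	if (chan.metadata_dumped) {
		return;
	}

	metadata += "stream { id = 0; };\n";
	chan.metadata_dumped = true;

	for (auto& event : chan.events) {
		example_event_statedump(chan, event, metadata);
	}
}
```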

It's possible that ust_metadata_event_statedump getting called before
ust_metadata_channel_statedump is a symptom of some more fundamental
problem that this patch only papers over, but I don't know enough
about this to be able to tell.

I couldn't think of an appropriate test to write for this. However, once
we generate CTF2, such a bug would likely be caught by trace readers
rejecting the invalid metadata. So if we were to re-introduce this bug,
we would notice.

Change-Id: I6e3158c801fcc01b318618890704d19b3230e7a5
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agolttng-sessiond: remove goto in ust_metadata_channel_statedump
Simon Marchi [Thu, 6 Jan 2022 18:27:55 +0000 (13:27 -0500)] 
lttng-sessiond: remove goto in ust_metadata_channel_statedump

A follow-up patch uses an std::vector declared in the middle of
ust_metadata_channel_statedump. This isn't compatible with the
goto-based error handling, since gotos should not jump over object
initialization (otherwise, the object gets destroyed without having been
constructed).

Moving the std::vector declaration to the beginning of the function
would work, but it would be a pessimization: we would construct an
object that we may not need, depending on the code path taken. We
therefore want to declare (and construct) the std::vector just before
we need it.

Fix this by replacing gotos with return statements.
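
As a small illustration of the resulting shape (hypothetical function
and names, not the actual statedump code), early returns let the
std::vector be declared right where it is needed:

```cpp
#include <cassert>
#include <vector>

/* Returns 0 on success, a negative value on error. */
static int example_collect(bool header_ok, std::vector<int>& out)
{
	if (!header_ok) {
		/*
		 * Early return. A `goto end;` here, with the `end:` label placed
		 * after the vector declaration below, would be rejected by the
		 * compiler since it bypasses the vector's initialization.
		 */
		return -1;
	}

	/* Constructed only on the code path that uses it. */
	std::vector<int> pending = { 1, 2, 3 };

	out = pending;
	return 0;
}
```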

Also, add a `ret` check after the last lttng_metadata_printf call. If
this call failed, for some reason, we would return an error, but still
set chan->metadata_dumped. That makes this case different than the other
error paths in the function, where chan->metadata_dumped doesn't get
set. Adding the check makes this case like the other ones.

Change-Id: Iba81422a7c3bac96a8d209bba6b4d53ad26b3e4e
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agolttng-sessiond: rename ust_registry_channel::ht to events
Simon Marchi [Thu, 6 Jan 2022 17:49:09 +0000 (12:49 -0500)] 
lttng-sessiond: rename ust_registry_channel::ht to events

For clarity, rename the field to "events" to indicate that it contains
the channel's events.

Change-Id: I0bd90c13d7c8e313fff72eb18d0d7ebfc23762d4
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoClean-up: Remove remaining max_t macros
Simon Marchi [Wed, 15 Dec 2021 19:23:05 +0000 (14:23 -0500)] 
Clean-up: Remove remaining max_t macros

I found two remaining max_t macros. Remove them, and adjust one call
site that was still using that.

Change-Id: Icaedcaea1a88e87262bfa544691db398a1bfd203
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoRename C++ header files to .hpp
Simon Marchi [Wed, 15 Dec 2021 20:13:05 +0000 (15:13 -0500)] 
Rename C++ header files to .hpp

Rename all C++ header files (include/**/*-internal.h, src/**/*.h except
argpar and msgpack, some headers in tests) to have the .hpp extension.

Doing so highlights that we include some C++ header files in some test
files still compiled as C. This is ok for now, as the files they include
don't actually contain C++ code incompatible with C yet, but they could
eventually. This is something we can fix later.

Change-Id: I8bf326b6b2946a3e26704f3ef3ac5831bbe9bc26
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoClean-up: sessiond: cmd_enable_channel_internal
Jérémie Galarneau [Fri, 25 Mar 2022 19:43:56 +0000 (15:43 -0400)] 
Clean-up: sessiond: cmd_enable_channel_internal

After catching an error code mixup in cmd_enable_channel_internal, this
change explicitly sets the return type of cmd_enable_channel_internal
and of its callees to `enum lttng_error_code` since those already return
these values as integers.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic3c042c2ee3d259cc694e6aaf3a1a2f3ca843042

2 years agoFix: sessiond: cmd_enable_channel: negative error code used
Jérémie Galarneau [Fri, 25 Mar 2022 19:34:47 +0000 (15:34 -0400)] 
Fix: sessiond: cmd_enable_channel: negative error code used

A negative `lttng_error_code` value is returned (as an integer)
when a channel copy fails. Return a positive error code.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I340f739cc33858a06832bb75a7a6d5e18459551f

2 years agoClean-up: remove unused makefile statements
Michael Jeanson [Mon, 22 Nov 2021 20:58:38 +0000 (15:58 -0500)] 
Clean-up: remove unused makefile statements

Change-Id: Ic8b6a68d64b866f177e6aa02a15c9930b468238a
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoUpdate vendored msgpack-c to 4.0.0
Michael Jeanson [Thu, 24 Mar 2022 18:29:36 +0000 (14:29 -0400)] 
Update vendored msgpack-c to 4.0.0

Upstream changes from 3.3.0 to 4.0.0:

  * Fix and improve alignment logic (#962)
  * Fix iovec name conflict (#953)
  * Fix empty string print (#942)
  * Fix buffer ptr size (#899)
  * Fix UB. Check null pointer before using memcpy() (#890)

Change-Id: Ifc4d7de43d0f11d6331d98d7cfa93227f8b756bc
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoFix: doc: action: wrong function documented for action_list destroy
Jonathan Rajotte [Thu, 24 Mar 2022 20:05:20 +0000 (16:05 -0400)] 
Fix: doc: action: wrong function documented for action_list destroy

The lttng_action_list_destroy function is internal.

API users must use `lttng_action_destroy()` to destroy the returned
object of `lttng_action_list_create()`.

Change-Id: Ic910efd07dd071f7e38e48d34a5e000b3f805729
Reported-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoFix: lttng-elf: wrong error label used by error path
Jérémie Galarneau [Fri, 18 Mar 2022 14:58:03 +0000 (10:58 -0400)] 
Fix: lttng-elf: wrong error label used by error path

1486805 Resource leak
The system resource will not be reclaimed and reused, reducing the future availability of the resource.

In lttng_elf_get_symbol_offset: Leak of memory or pointers to system resources (CWE-404)

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I78b868773b389c2eaf3f1d45151fe9416b1fe447

2 years agoClean-up: fix '-Wformat' warnings on various platforms
Michael Jeanson [Wed, 16 Mar 2022 18:18:36 +0000 (14:18 -0400)] 
Clean-up: fix '-Wformat' warnings on various platforms

Change-Id: I39a2dd8bb4f1f6654a65f9fab8d5ac74439a4410
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoClean-up: fix '-Wimplicit-fallthrough' warnings on various platforms
Michael Jeanson [Wed, 16 Mar 2022 16:06:16 +0000 (12:06 -0400)] 
Clean-up: fix '-Wimplicit-fallthrough' warnings on various platforms

Change-Id: I80ad6bebcd2eed9c3e83ef3b750254d1fd98e95d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoClean-up: fix '-Wunused-parameter' warnings on various platforms
Michael Jeanson [Wed, 16 Mar 2022 15:57:24 +0000 (11:57 -0400)] 
Clean-up: fix '-Wunused-parameter' warnings on various platforms

Change-Id: I35bd06414fd8407b2f281789ac2e419f40a08fa2
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoClean-up: fix '-Wundef' warnings on various platforms
Michael Jeanson [Wed, 16 Mar 2022 15:46:49 +0000 (11:46 -0400)] 
Clean-up: fix '-Wundef' warnings on various platforms

Change-Id: I8dfffd2ad5eb55a0b8fe74d29a82a224da19f30a
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoClean-up: silence warnings for generated code on RHEL8
Michael Jeanson [Wed, 16 Mar 2022 14:46:33 +0000 (10:46 -0400)] 
Clean-up: silence warnings for generated code on RHEL8

  lttng_wrap.lo -MD -MP -MF .deps/lttng_wrap.Tpo -c lttng_wrap.c  -fPIC -DPIC -o .libs/lttng_wrap.o
  lttng_wrap.c:1824:23: warning: cast between incompatible function types from ‘PyObject * (*)(PyObject *)’ {aka ‘struct _object * (*)(struct _object *)’} to ‘PyObject * (*)(PyObject *, PyObject *)’ {aka ‘struct _object * (*)(struct _object *, struct _object *)’} [-Wcast-function-type]
     {(char *)"disown",  (PyCFunction)SwigPyObject_disown,  METH_NOARGS,  (char *)"releases ownership of the pointer"},
                         ^

Change-Id: I9258a58317814fdc94c8ce3c76e615b73aaf4199
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoFix: use the correct endian compat macros
Michael Jeanson [Wed, 16 Mar 2022 15:40:52 +0000 (11:40 -0400)] 
Fix: use the correct endian compat macros

Document which variant of the endian macros our compat header guarantees
across all platforms and fix incorrect uses.

This was discovered with -Wundef on macOS.

Change-Id: Iaf442fe5887063661273ac2a00c9fa4015e83d5c
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoClean-up: tests: silence bogus warning
Jérémie Galarneau [Wed, 16 Mar 2022 21:55:45 +0000 (17:55 -0400)] 
Clean-up: tests: silence bogus warning

1486757 Buffer not null terminated
If the buffer is treated as a null terminated string in later operations, a buffer overflow or over-read may occur.

In test_create_ust_event_exclusion(): The string buffer may not have a null terminator if the source string's length is equal to the buffer size (CWE-170)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0c3fd6c9d591e1c67d8b80bf12825bb0d1520a65

2 years agoFix: tests: uninitialized lttng_payload
Jérémie Galarneau [Wed, 16 Mar 2022 21:39:07 +0000 (17:39 -0400)] 
Fix: tests: uninitialized lttng_payload

1474980 Uninitialized pointer read
Incorrect values could be read from, or even written to, an arbitrary memory location, causing incorrect computations.

In test_event_rule_userspace_probe(): Reads an uninitialized pointer or its target (CWE-457)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7645278f18e4a4678fb5ede9523d0cfa8d3aa106

2 years agoFix: sessiond: ust-app: uninitialized name logged on stream copy failure
Jérémie Galarneau [Wed, 16 Mar 2022 21:35:43 +0000 (17:35 -0400)] 
Fix: sessiond: ust-app: uninitialized name logged on stream copy failure

1466302 Uninitialized scalar variable
The variable will contain an arbitrary value left from earlier computations.

In send_channel_uid_to_ust(buffer_reg_channel *, ust_app *, ust_app_session *, ust_app_channel *): Use of an uninitialized variable (CWE-457)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Icd38de3b67dab783fa26a721c68c48ebfbb59785

2 years agoFix: lttng-elf: untrusted entry size divisor
Jérémie Galarneau [Wed, 16 Mar 2022 21:29:11 +0000 (17:29 -0400)] 
Fix: lttng-elf: untrusted entry size divisor

1405557 Untrusted divisor
The divisor could be controlled by an attacker, who could cause a division by zero.

In lttng_elf_get_symbol_offset: An unscrutinized value from an untrusted source used as a divisor (CWE-369)
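
The kind of check that addresses this class of defect can be sketched as
follows. The function and parameter names are hypothetical and do not
reflect the actual lttng-elf code.

```cpp
#include <cassert>
#include <cstdint>

/*
 * Compute the number of entries in a section. The entry size comes from
 * the file being parsed, so it is validated before being used as a
 * divisor. Returns false when the entry size is invalid.
 */
static bool example_entry_count(std::uint64_t section_size,
		std::uint64_t entry_size, std::uint64_t& count)
{
	if (entry_size == 0) {
		return false;
	}

	count = section_size / entry_size;
	return true;
}
```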

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I029708a0df4f62fe0031e374d50839c26f4f3f4b

2 years agoFix: tests: test definitions arrays contain invalid data
Jonathan Rajotte [Fri, 11 Mar 2022 18:39:58 +0000 (13:39 -0500)] 
Fix: tests: test definitions arrays contain invalid data

Observed issue
==============

The long_regression CI job fails on test_thread_stall.

 11:17:16 # export LTTNG_SESSION_CONFIG_XSD_PATH=/home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/common/
 11:17:16 # env /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/bin/lttng-sessiond/lttng-sessiond --background --consumerd64-path=/home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/bin/lttng-consumerd/lttng-consumerd 1
 11:17:16 ok 16 - Start session daemon
 11:17:16 # Check after running for 30 seconds
 11:17:16 not ok 17 - Validation failure
 11:17:16 #   Failed test 'Validation failure'
 11:17:16 #   in /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/regression/tools/health/../../../utils/tap/tap.sh:fail() at line 159.
 11:17:16 # Health returned:
 11:17:16 # stdout:
 11:17:16 # stderr:
 11:17:16 # Killing (signal SIGKILL) lttng-sessiond and lt-lttng-sessiond pids: 1840601 1840602
 11:17:16 ok 18 - Wait after kill session daemon

 ...

 17:57:01 # Test health problem detection with LTTNG_RELAYD_THREAD_DISPATCHER
 17:57:01 # Start session daemon
 17:57:01 # export LTTNG_SESSION_CONFIG_XSD_PATH=/home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/common/
 17:57:01 # env /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/bin/lttng-sessiond/lttng-sessiond --background --consumerd64-path=/home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/bin/lttng-consumerd/lttng-consumerd 1
 17:57:01 ok 38 - Start session daemon
 17:57:01 # /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/regression/tools/health/../../../../src/bin/lttng/lttng create health_thread_stall --no-output
 17:57:01 ok 39 - Create session health_thread_stall in no-output mode
 17:57:01 # With UST consumer daemons
 17:57:01 # /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/regression/tools/health/../../../../src/bin/lttng/lttng enable-event tp:tptest -c testchan -s health_thread_stall -u
 17:57:01 ok 40 - Enable ust event tp:tptest for session health_thread_stall
 17:57:01 ok 41 # skip: Root access is needed. Skipping kernel consumer health check test.
 17:57:01 # /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/regression/tools/health/../../../../src/bin/lttng/lttng start health_thread_stall
 17:57:01 ok 42 - Start tracing for session health_thread_stall
 17:57:01 # Check after running for 30 seconds
 17:57:01 not ok 43 - Validation failure
 17:57:01 #   Failed test 'Validation failure'
 17:57:01 #   in /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/regression/tools/health/../../../utils/tap/tap.sh:fail() at line 159.
 17:57:01 # Health returned:
 17:57:01 # stdout:
 17:57:01 # stderr:
 17:57:01 # Killing (signal SIGTERM) lttng-consumerd pids: 690297 690299
 17:57:01 Error: consumer closed the command socket
 17:57:01 Error: Health error occurred in thread_consumer_management
 17:57:01 ok 44 - Wait after kill consumer daemon

Cause
=====

After investigation, commit 3c3390532736cfb5198f863d0d2b218e21fcf76d [1]
introduces the test regression.

Although [1] removes `LTTNG_SESSIOND_THREAD_HT_CLEANUP` from the `THREAD`
array and the corresponding error message from `ERROR_STRING`, it does not
modify the `NEEDS_ROOT`, `TEST_CONSUMERD` and `TEST_RELAYD` arrays.

Also, the test count is not adjusted to reflect the removal of the
`THREAD` element.

Solution
========

Remove the unused data from `NEEDS_ROOT`, `TEST_CONSUMERD` and
`TEST_RELAYD` and adjust the test count.

Known drawbacks
===============

None.

References
==========

[1] https://github.com/lttng/lttng-tools/commit/3c3390532736cfb5198f863d0d2b218e21fcf76d

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9c16fa8d76b41f1a28fd342d9f076969f4ff1b13

2 years agoClean-up: exclusions: use LTTNG_EVENT_EXCLUSION_NAME_AT util
Jérémie Galarneau [Wed, 16 Mar 2022 20:35:30 +0000 (16:35 -0400)] 
Clean-up: exclusions: use LTTNG_EVENT_EXCLUSION_NAME_AT util

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I59da48be1a2d905d3a7ff485d353fdacddda784d

2 years agoClean-up: lttng-ctl: strnlen out of bounds access
Jérémie Galarneau [Tue, 15 Mar 2022 21:19:27 +0000 (17:19 -0400)] 
Clean-up: lttng-ctl: strnlen out of bounds access

gcc 11.2 produces the following warning. The lttng_strncpy helper
assumes that 'src' is a null terminated string. As such, the use of a
string literal (of size 37) in this specific example is correct as
strnlen will not read beyond the null terminator.

Replacing strnlen by strlen eliminates this warning. strnlen was used to
short-circuit the source length check when it was larger than the
destination. This optimization is unlikely to matter. Pascal-style
strings should be used when string length computations are expected to
be prohibitively expensive.

In file included from ../../../src/common/macros.h:15,
                 from ../../../include/lttng/health-internal.h:18,
                 from lttng-ctl-health.cpp:19:
In function 'size_t lttng_strnlen(const char*, size_t)',
    inlined from 'int lttng_strncpy(char*, const char*, size_t)' at ../../../src/common/macros.h:123:19,
    inlined from 'int set_health_socket_path(lttng_health*, int)' at lttng-ctl-health.cpp:198:22,
    inlined from 'int lttng_health_query(lttng_health*)' at lttng-ctl-health.cpp:319:30:
../../../src/common/compat/string.h:19:23: warning: 'size_t strnlen(const char*, size_t)' specified bound 4096 may exceed source size 37 [-Wstringop-overread]
   19 |         return strnlen(str, max);
      |                ~~~~~~~^~~~~~~~~~
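
As a rough sketch of the idea (not the actual lttng-tools code;
`copy_checked` is a hypothetical name), a copy helper that refuses to
truncate and relies on 'src' being null-terminated could look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, for illustration only: copy 'src' into 'dst' and
// fail instead of truncating.
// Assumption: 'src' is a null-terminated string, so strlen() never reads
// past its terminator.
int copy_checked(char *dst, const char *src, std::size_t dst_len)
{
    assert(dst && src);

    if (std::strlen(src) >= dst_len) {
        // No room for the string and its null terminator.
        return -1;
    }

    std::strcpy(dst, src);
    return 0;
}
```

The trade-off is that strlen() walks the whole source string even when it
is much longer than the destination, which is the optimization the
strnlen() call used to provide.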

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I290109433fcae7073321f1b48ecfbb2ec6e4ad26

2 years agoClean-up: sessiond-comm: out of bounds access warning
Jérémie Galarneau [Tue, 15 Mar 2022 21:13:03 +0000 (17:13 -0400)] 
Clean-up: sessiond-comm: out of bounds access warning

gcc 11.2 produces the two following warnings. In both cases, setting an
array's dimension to zero is used to express a variable length array of
names that are LTTNG_SYMBOL_NAME_LEN bytes long. gcc doesn't know about
this and correctly points out that an access is taking place outside of
the array's bounds.

Omit the '0' dimension to work around this warning.

event.cpp: In function 'ssize_t lttng_event_create_from_payload(lttng_payload_view*, lttng_event**, lttng_event_exclusion**, char**, lttng_bytecode**)':
event.cpp:320:62: warning: array subscript i is outside array bounds of 'char [0][256]' [-Warray-bounds]
  320 |                 ret = lttng_strncpy(local_exclusions->names[i],
      |                                     ~~~~~~~~~~~~~~~~~~~~~~~~~^
In file included from event.cpp:16:
../../src/common/sessiond-comm/sessiond-comm.h:569:14: note: while referencing 'lttng_event_exclusion::names'
  569 |         char names[0][LTTNG_SYMBOL_NAME_LEN];
      |              ^~~~~

event-rule/user-tracepoint.cpp: In function 'lttng_event_rule_generate_exclusions_status lttng_event_rule_user_tracepoint_generate_exclusions(const lttng_event_rule*, lttng_event_exclusion**)':
event-rule/user-tracepoint.cpp:383:61: warning: array subscript i is outside array bounds of 'char [0][256]' [-Warray-bounds]
  383 |                 copy_ret = lttng_strncpy(exclusions->names[i], exclusion_str,
      |                                          ~~~~~~~~~~~~~~~~~~~^
In file included from ../../src/common/runas.h:17,
                 from event-rule/user-tracepoint.cpp:17:
../../src/common/sessiond-comm/sessiond-comm.h:569:14: note: while referencing 'lttng_event_exclusion::names'
  569 |         char names[0][LTTNG_SYMBOL_NAME_LEN];
      |              ^~~~~
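
A minimal sketch of the "omitted dimension" form follows. The struct and
function names are hypothetical and `NAME_LEN` stands in for
LTTNG_SYMBOL_NAME_LEN. Note that flexible array members are a compiler
extension in C++ (accepted by GCC and Clang), so this is an illustration
under that assumption rather than portable code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

constexpr std::size_t NAME_LEN = 256; // stands in for LTTNG_SYMBOL_NAME_LEN

// Hypothetical header followed by a variable number of fixed-size names.
struct exclusion_list {
    std::uint32_t count;
    char names[][NAME_LEN]; // flexible array member: no '0' dimension
};

// Allocate the header and 'count' zero-initialized names in one block.
exclusion_list *exclusion_list_create(std::uint32_t count)
{
    void *mem = std::calloc(1, sizeof(exclusion_list) + count * NAME_LEN);
    if (!mem) {
        return nullptr;
    }

    auto *list = static_cast<exclusion_list *>(mem);
    list->count = count;
    return list;
}

// Store a name at index 'i'; fails on a bad index or a name that is too long.
int exclusion_list_set_name(exclusion_list *list, std::uint32_t i, const char *name)
{
    assert(list && name);

    if (i >= list->count || std::strlen(name) >= NAME_LEN) {
        return -1;
    }

    std::strcpy(list->names[i], name);
    return 0;
}
```

The caller releases the list with free() since it comes from a single
calloc() allocation.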

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I260185f2baf085ca4486ce3b13696ee5fa55938a

2 years agoFix: event: erroneous bound check on perf counter name size
Jérémie Galarneau [Wed, 16 Mar 2022 15:56:21 +0000 (11:56 -0400)] 
Fix: event: erroneous bound check on perf counter name size

The wrong size is used when initializing a perf counter name from a
payload. The destination size must be used to prevent out of bound
writes.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8ea41d30815bd2c02bb2ad8b01e8cecd2d6549a8

2 years agoFix: sessiond: event name length check is too strict
Jérémie Galarneau [Wed, 16 Mar 2022 15:55:08 +0000 (11:55 -0400)] 
Fix: sessiond: event name length check is too strict

A truncation check when initializing an event from an event rule limits
the name to one less character than is supposed to be allowed.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I172c5098597923d14508c150c7b3577f759bae72

2 years agoClean-up: use sizeof instead of repeating string length constant
Jérémie Galarneau [Wed, 16 Mar 2022 15:52:42 +0000 (11:52 -0400)] 
Clean-up: use sizeof instead of repeating string length constant

Looking into a number of coverity reports (false positives), I
identified a number of sites which use the maximal symbol length
constant when the actual size of an array can be used. This will prevent
mismatches in the future should the array sizes change.
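
To illustrate the pattern (with hypothetical names, not the sites touched
by this change), the bound is derived from the destination array itself
rather than from a constant that happens to match it today:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t SYMBOL_NAME_LEN = 256; // hypothetical constant

// Hypothetical structure; the array could later be resized on its own.
struct event_desc {
    char name[SYMBOL_NAME_LEN];
};

// sizeof(ev.name) always matches the array's real size, whereas repeating
// SYMBOL_NAME_LEN here would silently go stale if the array changed.
int set_event_name(event_desc& ev, const char *name)
{
    assert(name);

    const std::size_t len = std::strlen(name);
    if (len >= sizeof(ev.name)) {
        return -1; // would not fit with its null terminator
    }

    std::memcpy(ev.name, name, len + 1);
    return 0;
}
```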

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia74f43d3871fdce60affbde068401b58c84b09ad

2 years agoFix: relayd: missing session unlock on error path
Jérémie Galarneau [Wed, 16 Mar 2022 14:48:59 +0000 (10:48 -0400)] 
Fix: relayd: missing session unlock on error path

1475890 Missing unlock May result in deadlock if there is another
attempt to acquire the lock.

In viewer_get_new_streams(relay_connection *): Missing a release of a
lock on a path (CWE-667)
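
This class of defect can also be avoided structurally with a scoped lock.
The sketch below is not what this fix does; it uses std::mutex and
std::lock_guard on a hypothetical `session` type only to show how every
return path, including the error path, releases the lock:

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a relay session protected by a lock.
struct session {
    std::mutex lock;
    bool closed = false;
    int stream_count = 0;
};

// Returns the number of streams, or -1 on error. The guard's destructor
// unlocks the mutex on both the error path and the success path.
int get_new_streams(session& s)
{
    std::lock_guard<std::mutex> guard(s.lock);

    if (s.closed) {
        return -1; // early return: the lock is still released
    }

    assert(s.stream_count >= 0);
    return s.stream_count;
}
```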

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I66de344f5f39ac85bf8db93cf39a07d0c6cf7694

2 years agoBring compiler warning flags in line with other projects
Michael Jeanson [Mon, 7 Mar 2022 18:59:46 +0000 (13:59 -0500)] 
Bring compiler warning flags in line with other projects

Change-Id: I0281a357afbace553368cd01357bb2f21de3352d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoPrepare for '-Wsign-compare'
Michael Jeanson [Tue, 8 Mar 2022 23:26:06 +0000 (18:26 -0500)] 
Prepare for '-Wsign-compare'

In preparation for '-Wextra'

Change-Id: I9a3b91009b2b44c0aeacfb37fa2b8b901be79992
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoPrepare for '-Wimplicit-fallthrough'
Michael Jeanson [Tue, 8 Mar 2022 23:16:06 +0000 (18:16 -0500)] 
Prepare for '-Wimplicit-fallthrough'

In preparation for '-Wextra'

Change-Id: Ice4c5aa7f6ce9107c88f38ec4024a4631589ad73
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoPrepare for '-Wmissing-field-initializers'
Michael Jeanson [Tue, 8 Mar 2022 23:05:26 +0000 (18:05 -0500)] 
Prepare for '-Wmissing-field-initializers'

In preparation for '-Wextra'

Change-Id: Ic593491ad44c1254f158b19659c3b9567d180ad1
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoPrepare for '-Wignored-qualifiers'
Michael Jeanson [Tue, 8 Mar 2022 16:33:30 +0000 (11:33 -0500)] 
Prepare for '-Wignored-qualifiers'

In preparation for '-Wextra'

Change-Id: I6734a105170da2d57480fb5e15cae839adc38e62
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoPrepare for '-Wunused-parameter'
Michael Jeanson [Tue, 8 Mar 2022 15:52:55 +0000 (10:52 -0500)] 
Prepare for '-Wunused-parameter'

In preparation for '-Wextra'

Change-Id: I30e6abb9502fc97daa565fde450d1e4235cf1ec7
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoconfigure: add '-Wredundant-decls' to warning flags
Michael Jeanson [Wed, 9 Mar 2022 16:32:09 +0000 (11:32 -0500)] 
configure: add '-Wredundant-decls' to warning flags

Change-Id: I5329ebe83aab40e6796b506c28e853b4af3c5e99
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoconfigure: add '-Wmissing-noreturn' to warning flags
Michael Jeanson [Wed, 9 Mar 2022 15:19:17 +0000 (10:19 -0500)] 
configure: add '-Wmissing-noreturn' to warning flags

Change-Id: I95a981348109d4614afcfe9c85f971e65afc2765
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoconfigure: add '-Wlogical-op' to warning flags
Michael Jeanson [Mon, 7 Mar 2022 20:59:30 +0000 (15:59 -0500)] 
configure: add '-Wlogical-op' to warning flags

Change-Id: I0516add62151b22352f96d1e62871a013b8fa6f3
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoconfigure: add '-Wundef' to warning flags
Michael Jeanson [Mon, 7 Mar 2022 19:21:21 +0000 (14:21 -0500)] 
configure: add '-Wundef' to warning flags

Change-Id: If47c16121b1679862e7a5f75fce70c7d9973e92e
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoconfigure: add '-Wnull-dereference' to warning flags
Michael Jeanson [Mon, 7 Mar 2022 19:02:29 +0000 (14:02 -0500)] 
configure: add '-Wnull-dereference' to warning flags

Change-Id: Ife5ad6963262c5c2715954fcd34c94015fb30aa6
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agofix: warning '-Wstringop-truncation' with GCC 11.2
Michael Jeanson [Mon, 7 Mar 2022 16:37:49 +0000 (11:37 -0500)] 
fix: warning '-Wstringop-truncation' with GCC 11.2

Building with GCC 11.2 results in the following warnings:

  In file included from ../../src/common/tracker.h:18,
                 from ../../src/bin/lttng-sessiond/trace-ust.h:17,
                 from test_ust_data.cpp:19:
../../src/common/sessiond-comm/sessiond-comm.h:569:14: note: while referencing ‘lttng_event_exclusion::names’
  569 |         char names[0][LTTNG_SYMBOL_NAME_LEN];
      |              ^~~~~
test_ust_data.cpp:209:16: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 256 equals destination size [-Wstringop-truncation]
  209 |         strncpy(LTTNG_EVENT_EXCLUSION_NAME_AT(exclusion, 0),
      |         ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  210 |                 get_random_string(), LTTNG_SYMBOL_NAME_LEN);
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_ust_data.cpp:211:16: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 256 equals destination size [-Wstringop-truncation]
  211 |         strncpy(LTTNG_EVENT_EXCLUSION_NAME_AT(exclusion, 1),
      |         ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  212 |                 get_random_string(), LTTNG_SYMBOL_NAME_LEN);
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Change-Id: I78eea760b4684227ee457c3368c6397d0a767af5
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agofix: warning '-Wformat-overflow' with GCC 11.2
Michael Jeanson [Mon, 7 Mar 2022 16:28:19 +0000 (11:28 -0500)] 
fix: warning '-Wformat-overflow' with GCC 11.2

Building with GCC 11.2 results in the following warning:

  In file included from rotation-thread.cpp:11:
  In function 'int handle_job_queue(rotation_thread_handle*, rotation_thread*, rotation_thread_timer_queue*)',
      inlined from 'void* thread_rotation(void*)' at rotation-thread.cpp:844:27:
  ../../../src/common/error.h:139:32: warning: '%s' directive argument is null [-Wformat-overflow=]
    139 |                         fprintf((type) == PRINT_MSG ? stdout : stderr, fmt, ## args);   \
        |                         ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  ../../../src/common/error.h:155:25: note: in expansion of macro '__lttng_print'
    155 |                         __lttng_print(type,                             \
        |                         ^~~~~~~~~~~~~
  ../../../src/common/error.h:195:27: note: in expansion of macro '_ERRMSG'
    195 | #define DBG(fmt, args...) _ERRMSG("DBG1", PRINT_DBG, fmt, ## args)
        |                           ^~~~~~~
  rotation-thread.cpp:587:25: note: in expansion of macro 'DBG'
    587 |                         DBG("Session \"%s\" not found",
        |                         ^~~

Use an empty string as the '%s' argument if 'session->name' is NULL.
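
A minimal sketch of that guard, with hypothetical helper names and
snprintf() standing in for the DBG() macro:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: never hand a null pointer to a '%s' directive.
const char *str_or_empty(const char *s)
{
    return s ? s : "";
}

// Formats the message into 'buf' and returns snprintf()'s result.
int format_not_found(char *buf, std::size_t len, const char *session_name)
{
    assert(buf && len > 0);

    return std::snprintf(buf, len, "Session \"%s\" not found",
            str_or_empty(session_name));
}
```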

Change-Id: Ibe29b43c0e8afd13b1c28770e8f7451340cc1e81
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoCleanup: DIST_SUBDIRS is redundant when using AM conditionals
Michael Jeanson [Mon, 22 Nov 2021 19:43:09 +0000 (14:43 -0500)] 
Cleanup: DIST_SUBDIRS is redundant when using AM conditionals

From automake's documentation [1]:

  If SUBDIRS is defined conditionally using Automake conditionals,
  Automake will define DIST_SUBDIRS automatically from the possible
  values of SUBDIRS in all conditions.

[1] https://www.gnu.org/software/automake/manual/html_node/SUBDIRS-vs-DIST_005fSUBDIRS.html

Change-Id: I8495f1f4452ccde4920ecd63bfd37de4eb10c281
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoFix: relayd: session id is ignored by 2.11+ create session command
Jérémie Galarneau [Thu, 10 Mar 2022 22:46:31 +0000 (17:46 -0500)] 
Fix: relayd: session id is ignored by 2.11+ create session command

The id of the session used by the sessiond is not returned by
cmd_create_session_2_11 and its caller sets the value in the
relay_session to an uninitialized value.

Up until recently this didn't have much effect as this uninitialized
value was stored and used to perform look-ups in the trace chunk
registry, which would work.

However, the recent multi-consumer rotation fixes make this problem more
significant as this 'id' is used as a key to join relay sessions
originating from the same session daemon.

This was discovered by enabling the '-Wunused-parameter' warning.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7e33f5f93dc46bb630c431408c9472be3a75c030

2 years agoFix: consumerd: use-after-free of metadata bucket
Jérémie Galarneau [Thu, 3 Mar 2022 00:27:31 +0000 (19:27 -0500)] 
Fix: consumerd: use-after-free of metadata bucket

Observed issue
==============

When consumer_stream_destroy() is called from, for example, the error
path in setup_metadata(), consumer_stream_free() can end up being called
twice on the same stream.  Since the stream->metadata_bucket is not set
to NULL after being destroyed, it leads to a use-after-free:

 ERROR: AddressSanitizer: heap-use-after-free on address 0x604000000318
 READ of size 8 at 0x604000000318 thread T7
     #0 in metadata_bucket_destroy
     #1 in consumer_stream_free
     #2 in consumer_stream_destroy
     #3 in setup_metadata
     #4 in lttng_ustconsumer_recv_cmd
     #5 in lttng_consumer_recv_cmd
     #6 in consumer_thread_sessiond_poll
     #7 in start_thread nptl/pthread_create.c:481
     #8 in clone (/lib/x86_64-linux-gnu/libc.so.6+0xfcbde)

 0x604000000318 is located 8 bytes inside of 48-byte region [0x604000000310,0x604000000340)
 freed by thread T7 here:
     #0 in __interceptor_free
     #1 in metadata_bucket_destroy
     #2 in consumer_stream_free
     #3 in consumer_stream_destroy
     #4 in clean_channel_stream_list
     #5 in consumer_del_channel
     #6 in consumer_stream_destroy
     #7 in setup_metadata
     #8 in lttng_ustconsumer_recv_cmd
     #9 in lttng_consumer_recv_cmd
     #10 in consumer_thread_sessiond_poll
     #11 in start_thread nptl/pthread_create.c:481

 previously allocated by thread T7 here:
     #0 in __interceptor_calloc
     #1 in zmalloc
     #2 in metadata_bucket_create
     #3 in consumer_stream_enable_metadata_bucketization
     #4 in lttng_ustconsumer_set_stream_ops
     #5 in lttng_ustconsumer_on_recv_stream
     #6 in lttng_consumer_on_recv_stream
     #7 in create_ust_streams
     #8 in ask_channel
     #9 in lttng_ustconsumer_recv_cmd
     #10 in lttng_consumer_recv_cmd
     #11 in consumer_thread_sessiond_poll
     #12 in start_thread nptl/pthread_create.c:481

 Thread T7 created by T0 here:
     #0 in __interceptor_pthread_create
     #1 in main
     #2 in __libc_start_main ../csu/libc-start.c:332

 SUMMARY: AddressSanitizer: heap-use-after-free in metadata_bucket_destroy

This can be easily reproduced by forcing a failure during the setup
of the metadata, using the following change:

  diff --git a/src/common/ust-consumer/ust-consumer.c b/src/common/ust-consumer/ust-consumer.c
  index fa1c71299..97ed59632 100644

  --- a/src/common/ust-consumer/ust-consumer.c
  +++ b/src/common/ust-consumer/ust-consumer.c
  @@ -908,8 +908,7 @@ static int setup_metadata(struct lttng_consumer_local_data *ctx, uint64_t key)

           /* Send metadata stream to relayd if needed. */
           if (metadata->metadata_stream->net_seq_idx != (uint64_t) -1ULL) {
  -                ret = consumer_send_relayd_stream(metadata->metadata_stream,
  -                                metadata->pathname);
  +                ret = -1;
                   if (ret < 0) {
                           ret = LTTCOMM_CONSUMERD_ERROR_METADATA;
                           goto error;

Cause
=====

Channels have a list of streams that are being "setup" and are not
yet monitored for consumption. During this setup phase, the streams are
owned by the channel. On destruction of the channel, any stream in that
list will thus be cleaned-up.

When destroying a consumer stream, a reference to its channel is 'put'.
This can result in the destruction of the channel.

In the situation described above, the release of the channel's reference
is done before the stream is removed from the channel's stream list.
This causes the channel's clean-up to invoke (again) the current
stream's clean-up, resulting in the double-free of the metadata bucket.

This problem is present in a number of error paths.

Solution
========

Some error paths already manually removed the consumer stream from its
channel's stream list before invoking consumer_stream_destroy(). The
various error paths that have to deal with this possible situation are
changed to simply invoke consumer_stream_destroy().

consumer_stream_destroy() is modified to always remove the stream from
its channel's list before performing the rest of the clean-up. This
ensures that those double clean-ups can't occur.
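
The ordering can be modelled with a small sketch. All types and functions
below are hypothetical simplifications (a counter stands in for freeing
the metadata bucket); they are not the consumer daemon's actual
structures:

```cpp
#include <cassert>
#include <list>

struct channel;

// Hypothetical stream: counts how often its resources were released.
struct stream {
    channel *chan = nullptr;
    int release_count = 0; // 2 would correspond to a double-free
};

// Hypothetical channel owning the streams that are still being set up.
struct channel {
    int refcount = 1;
    std::list<stream *> setup_streams;
};

void stream_release_resources(stream& s)
{
    ++s.release_count;
}

// Drop a reference; the last one cleans up any stream still in the list.
void channel_put(channel& c)
{
    assert(c.refcount > 0);
    if (--c.refcount > 0) {
        return;
    }

    for (stream *s : c.setup_streams) {
        stream_release_resources(*s);
    }
    c.setup_streams.clear();
}

void stream_destroy(stream& s)
{
    channel *c = s.chan;

    if (c) {
        // Unlink first so the channel's clean-up cannot reach this stream.
        c->setup_streams.remove(&s);
        s.chan = nullptr;
        channel_put(*c);
    }

    stream_release_resources(s);
}
```

If the reference were put before the stream was unlinked, the channel's
clean-up would release the stream once and stream_destroy() would release
it a second time.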

Drawbacks
=========

None.

Reported-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Tested-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ibeca9b675b86fc46be3f57826f7158de4da43df8

2 years agoFix: ust-consumerd: leak of stream control structure
Jérémie Galarneau [Thu, 3 Mar 2022 22:52:33 +0000 (17:52 -0500)] 
Fix: ust-consumerd: leak of stream control structure

The following leak is reported by LeakSanitizer when
setup_metadata() fails to send the metadata stream to the relay
daemon:

  ==3050181==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 240 byte(s) in 5 object(s) allocated from:
      #0 0x7f5fce02cfb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
      #1 0x7f5fcdd95a7a in zmalloc ../../../src/common/macros.h:23
      #2 0x7f5fcdd95a7a in lttng_ust_ctl_create_stream /home/jgalar/EfficiOS/src/lttng-ust/src/lib/lttng-ust-ctl/ustctl.c:1649

A consumer stream can have an allocated
`struct lttng_ust_ctl_consumer_stream *` (ustream) even if it is
not globally visible at the time of its teardown.

In the case of the user space consumer, the only site that creates
consumer stream instances ensures that the allocation of the
lttng_ust_ctl_consumer_stream succeeded, ensuring that the
consumer stream's 'ustream' is always set.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia4be7969e85bd8812ae13b042e1e100812a63c1d

2 years agoFix: liblttng-ctl: erroneous flat size computation
Jérémie Galarneau [Fri, 4 Mar 2022 20:29:12 +0000 (15:29 -0500)] 
Fix: liblttng-ctl: erroneous flat size computation

compute_flattened_size() erroneously computes (over-estimates) the size
of the allocation required to hold the flat array of struct lttng_event
returned to the user by lttng_list_{events, syscalls, tracepoints}.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0a80ef0fa66428b7df34303804a024e80b635c69

2 years agofix: msgpack requires limits.h for UINT_MAX
Michael Jeanson [Thu, 5 Aug 2021 20:49:26 +0000 (16:49 -0400)] 
fix: msgpack requires limits.h for UINT_MAX

Building with '-Wundef' reveals this issue:

  unpack.c: In function ‘template_callback_array’:
  unpack.c:197:17: warning: "UINT_MAX" is not defined, evaluates to 0 [-Wundef]
    197 | #if SIZE_MAX == UINT_MAX
        |                 ^~~~~~~~
  unpack.c: In function ‘template_callback_map’:
  unpack.c:241:17: warning: "UINT_MAX" is not defined, evaluates to 0 [-Wundef]
    241 | #if SIZE_MAX == UINT_MAX
        |                 ^~~~~~~~
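
For illustration, here is the same preprocessor test with the required
header in place. The sketch uses the C++ header names and a hypothetical
constant; the affected file is C and would include limits.h:

```cpp
#include <climits> // UINT_MAX; without it, '#if' silently treats the name as 0
#include <cstddef> // std::size_t
#include <cstdint> // SIZE_MAX

// Hypothetical constant recording the outcome of the preprocessor test.
#if SIZE_MAX == UINT_MAX
constexpr bool size_t_matches_uint = true;
#else
constexpr bool size_t_matches_uint = false;
#endif
```

When UINT_MAX is undefined, the comparison becomes `SIZE_MAX == 0`, so the
'#else' branch is always taken regardless of the platform.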

Change-Id: I7dadd9f7013d613509f66e67ff1beb8ae593d2bf
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoSplit warning flags list for C and C++
Michael Jeanson [Thu, 3 Mar 2022 23:28:26 +0000 (18:28 -0500)] 
Split warning flags list for C and C++

When using Ccache [1], some flags specific to C are accepted by the C++
compiler but result in warning messages on each invocation of the
compiler. To remediate this, split the warning flags detection list in
three, a common base and a specific list for C and C++.

[1] https://github.com/ccache/ccache/issues/738

Change-Id: I9ef360efbfae445845ca1016e5f5eebdd3bdb0ac
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
2 years agoTests: add a multi-domain ust+kernel rotation test
Jérémie Galarneau [Thu, 13 Jan 2022 20:38:06 +0000 (15:38 -0500)] 
Tests: add a multi-domain ust+kernel rotation test

Validate that multi-domain rotations work as intended for both local and
remote outputs. This validates the fix introduced by c5c793.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I91ddd313fcfbc5421db281baea34ba2d0aae2c82

2 years agoFix: relayd: connection abruptly closed on viewer stream creation failure
Jérémie Galarneau [Wed, 2 Mar 2022 17:59:17 +0000 (12:59 -0500)] 
Fix: relayd: connection abruptly closed on viewer stream creation failure

Commit fe88e5175 explains (and fixes) an issue that could cause the
creation of viewer streams to fail. Currently, the error path causes the
relay daemon to abruptly close the connection to its live viewer peer.
This behaviour makes it impossible for the viewer to determine if an
error occurred or if the network connection simply failed.

Returning an `LTTNG_VIEWER_NEW_STREAMS_ERR` status code allows the
viewer to report a precise error. The viewer connection is closed since
the internal error is unlikely to be recoverable.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I557a8b66c1bd1c0bf361cfbabe962d8a6808f4f4

2 years agoFix: relayd: live client fails on clear of multi-domain session
Jérémie Galarneau [Wed, 2 Mar 2022 17:37:39 +0000 (12:37 -0500)] 
Fix: relayd: live client fails on clear of multi-domain session

Observed issue
==============

Two test cases of the clear/test_ust test suite occasionally fail in the
integration jobs testing cross-bitness (32/64) LTTng deployments.

Babeltrace fails with the following error when a clear occurs while a
live client consumes a trace:

  02-28 16:55:03.262 32362 32362 E PLUGIN/SRC.CTF.LTTNG-LIVE/VIEWER lttng_live_recv@viewer-connection.c:198 [lttng-live] Remote side has closed connection
  02-28 16:55:03.262 32362 32362 E PLUGIN/SRC.CTF.LTTNG-LIVE/VIEWER lttng_live_session_get_new_streams@viewer-connection.c:1706 [lttng-live] Error receiving get new streams reply
  02-28 16:55:03.262 32362 32362 E PLUGIN/SRC.CTF.LTTNG-LIVE lttng_live_msg_iter_next@lttng-live.c:1665 [lttng-live] Error preparing the next batch of messages: live-iter-status=LTTNG_LIVE_ITERATOR_STATUS_ERROR
  02-28 16:55:03.262 32362 32362 W LIB/MSG-ITER bt_message_iterator_next@iterator.c:864 Component input port message iterator's "next" method failed: iter-addr=0x55eab7eb1170, iter-upstream-comp-name="lttng-live", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE, iter-upstream-comp-class-name="lttng-live", iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  02-28 16:55:03.262 32362 32362 E PLUGIN/FLT.UTILS.MUXER muxer_upstream_msg_iter_next@muxer.c:454 [muxer] Upstream iterator's next method returned an error: status=ERROR
  02-28 16:55:03.262 32362 32362 E PLUGIN/FLT.UTILS.MUXER validate_muxer_upstream_msg_iters@muxer.c:991 [muxer] Cannot validate muxer's upstream message iterator wrapper: muxer-msg-iter-addr=0x55eab7eb1120, muxer-upstream-msg-iter-wrap-addr=0x55eab7eb3a70
  02-28 16:55:03.262 32362 32362 E PLUGIN/FLT.UTILS.MUXER muxer_msg_iter_next@muxer.c:1415 [muxer] Cannot get next message: comp-addr=0x55eab7eb0470, muxer-comp-addr=0x55eab7eb0510, muxer-msg-iter-addr=0x55eab7eb1120, msg-iter-addr=0x55eab7eb0fb0, status=ERROR
  02-28 16:55:03.262 32362 32362 W LIB/MSG-ITER bt_message_iterator_next@iterator.c:864 Component input port message iterator's "next" method failed: iter-addr=0x55eab7eb0fb0, iter-upstream-comp-name="muxer", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER, iter-upstream-comp-class-name="muxer", iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  02-28 16:55:03.262 32362 32362 W LIB/GRAPH consume_graph_sink@graph.c:473 Component's "consume" method failed: status=ERROR, comp-addr=0x55eab7eb0760, comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK, comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages (`text` fo", comp-class-is-frozen=1, comp-class-so-handle-addr=0x55eab7ebd910, comp-class-so-handle-path="/root/workspace/joraj_integration_base_job/deps-64/build/lib/babeltrace2/plugins/babeltrace-plugin-text.so", comp-input-port-count=1, comp-output-port-count=0
  02-28 16:55:03.262 32362 32362 E CLI cmd_run@babeltrace2.c:2548 Graph failed to complete successfully

  ERROR:    [Babeltrace CLI] (babeltrace2.c:2548)
    Graph failed to complete successfully
  CAUSED BY [libbabeltrace2] (graph.c:473)
    Component's "consume" method failed: status=ERROR, comp-addr=0x55eab7eb0760,
    comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK,
    comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages
    (`text` fo", comp-class-is-frozen=1, comp-class-so-handle-addr=0x55eab7ebd910,
    comp-class-so-handle-path="/root/workspace/joraj_integration_base_job/deps-64/build/lib/babeltrace2/plugins/babeltrace-plugin-text.so",
    comp-input-port-count=1, comp-output-port-count=0
  CAUSED BY [libbabeltrace2] (iterator.c:864)
    Component input port message iterator's "next" method failed:
    iter-addr=0x55eab7eb0fb0, iter-upstream-comp-name="muxer",
    iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER,
    iter-upstream-comp-class-name="muxer",
    iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu",
    iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  CAUSED BY [muxer: 'filter.utils.muxer'] (muxer.c:991)
    Cannot validate muxer's upstream message iterator wrapper:
    muxer-msg-iter-addr=0x55eab7eb1120,
    muxer-upstream-msg-iter-wrap-addr=0x55eab7eb3a70
  CAUSED BY [muxer: 'filter.utils.muxer'] (muxer.c:454)
    Upstream iterator's next method returned an error: status=ERROR
  CAUSED BY [libbabeltrace2] (iterator.c:864)
    Component input port message iterator's "next" method failed:
    iter-addr=0x55eab7eb1170, iter-upstream-comp-name="lttng-live",
    iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE,
    iter-upstream-comp-class-name="lttng-live",
    iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon",
    iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (lttng-live.c:1665)
    Error preparing the next batch of messages:
    live-iter-status=LTTNG_LIVE_ITERATOR_STATUS_ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (viewer-connection.c:1706)
    Error receiving get new streams reply
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (viewer-connection.c:198)
    Remote side has closed connection

Looking at the relay daemon logs, we see the following error:
  DBG1 - 16:55:03.262106718 [32139/32146]: Adding new file "ust/pid/gen-ust-events-32373-20220228-165503/chan_0" to trace chunk "(unnamed)" (in lttng_trace_chunk_add_file() at trace-chunk.cpp:1310)
  PERROR - 16:55:03.262133333 [32139/32146]: Failed to open fs handle to ust/pid/gen-ust-events-32373-20220228-165503/chan_0, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker/fd-tracker.cpp:548)

Cause
=====

Adding more debug logging shows that the following sequence takes
place:

- relay thread: Create trace chunk on session 1.
- live thread: get new streams against session 1, returns NO_NEW_STREAMS
  since the session has an 'ongoing_rotation'.
- live thread: get new streams against session 2, sees no rotation
  ongoing and attempts to open `chan_0` when creating a viewer stream

The "ongoing rotation" check was introduced in a7ceb342d and, in a
nutshell, prevents live viewers from creating new viewer streams during
a rotation.

The "ongoing rotation" state is entered when a CREATE_NEW_TRACE_CHUNK
command is issued against a session.

However, this presumes that a relay_session maps 1:1 to a session on the
session daemon's end. This isn't the case as, in multi-domain
scenarios (tracing 32-bit, 64-bit, and kernel events), a single session
daemon session can map to multiple relay_session objects. This is
because the consumer daemons maintain independent connections to the
relay daemon.

To synchronize rotations across related relay_session instances, the
relay daemon uses the same trace chunk instances across relay_session
instances. This means that while a trace chunk is created against a
specific relay session, it can be used by other relay_session instances.

To manage shared trace chunks between relay_sessions, the relay daemon
makes use of the trace_chunk_registry. This registry allows
relay_sessions to share trace chunk instances using a unique key tuple:
  - session daemon instance uuid,
  - session daemon session id,
  - trace chunk id.
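
As an illustration of the key tuple above, here is a minimal sketch of a
registry that hands out one shared chunk instance per key. All names
(chunk_registry, get_or_publish, trace_chunk, sessiond_uuid) are
hypothetical and chosen for this example; this is not the relay daemon's
trace_chunk_registry implementation.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <memory>
#include <tuple>

// Hypothetical stand-ins; not the relay daemon's actual types.
using sessiond_uuid = std::array<std::uint8_t, 16>;

struct trace_chunk {
	std::uint64_t id;
};

class chunk_registry {
public:
	// (session daemon instance uuid, session daemon session id, chunk id)
	using key = std::tuple<sessiond_uuid, std::uint64_t, std::uint64_t>;

	// Return the chunk published under this key, creating it on first use.
	std::shared_ptr<trace_chunk> get_or_publish(const sessiond_uuid& uuid,
			std::uint64_t session_id,
			std::uint64_t chunk_id)
	{
		auto& slot = _chunks[key(uuid, session_id, chunk_id)];
		auto chunk = slot.lock();

		if (!chunk) {
			// First user of this key (or all previous users are gone).
			chunk = std::make_shared<trace_chunk>(trace_chunk{chunk_id});
			slot = chunk;
		}

		return chunk;
	}

private:
	// Weak references: the registry does not keep chunks alive by itself.
	std::map<key, std::weak_ptr<trace_chunk>> _chunks;
};
```

In this sketch, two relay sessions that present the same key receive the
same instance, while a different session daemon session id yields a
distinct chunk.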

There is no equivalent mechanism to track the "ongoing_rotation" state
across relay_sessions originating from the same sessiond session.

In the current scenario, this causes the live client to correctly see
that no new streams are available for session 1 (say, the 32-bit user
space session). Unfortunately, this state is not entered for other
sessions (64-bit and kernel relay sessions). Hence, the viewer succeeds
in acquiring new streams from session 2, exposing the race the 'ongoing
rotation' state aims to protect against.

Solution
========

Like the trace chunk instances, the "ongoing rotation" state must be
shared across relay sessions that originate from the same session
daemon session.

To "emulate" this shared state, session_has_ongoing_rotation() checks
if any relay session originating from the same sessiond session
has an ongoing rotation. If that is the case, we temporarily prevent
live viewers from acquiring new streams.
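
The check described above can be sketched roughly as follows. This is a
simplified model under stated assumptions, not the actual
session_has_ongoing_rotation() code: the relay_session structure, its
fields and the any_related_session_rotating() helper are hypothetical,
and the session list is passed in explicitly instead of coming from the
relay daemon's own data structures.

```cpp
#include <array>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical, simplified model of a relay session.
struct relay_session {
	// Identity of the originating sessiond session; never changes.
	std::array<std::uint8_t, 16> sessiond_uuid{};
	std::uint64_t sessiond_session_id = 0;
	// Protected by `lock`.
	bool ongoing_rotation = false;
	std::mutex lock;
};

// Returns true if `target`, or any relay session originating from the same
// sessiond session, has an ongoing rotation.
//
// The caller must not hold any session lock. Every session's lock is
// taken in turn, which is the performance drawback of this approach.
bool any_related_session_rotating(relay_session& target,
		const std::vector<relay_session *>& all_sessions)
{
	for (relay_session *candidate : all_sessions) {
		std::lock_guard<std::mutex> guard(candidate->lock);

		const bool same_origin =
				candidate->sessiond_uuid == target.sessiond_uuid &&
				candidate->sessiond_session_id == target.sessiond_session_id;

		if (same_origin && candidate->ongoing_rotation) {
			return true;
		}
	}

	return false;
}
```

With such a helper, a "get new streams" request against the 64-bit relay
session would report no new streams while the 32-bit relay session of the
same sessiond session is rotating.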

Known drawbacks
===============

session_has_ongoing_rotation() iterates over all sessions, acquiring
their lock in the process, which is certainly undesirable from a
performance standpoint.

Optimizing this is not a great challenge, but is beyond the scope
of this immediate fix.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I457a32fa497d151ca933c25433c80665268a7c1c

2 years agoFix: rotation: hang on destroy when using scheduled rotation based on timer
Jonathan Rajotte [Mon, 14 Feb 2022 16:23:28 +0000 (11:23 -0500)] 
Fix: rotation: hang on destroy when using scheduled rotation based on timer

Observed issue
==============

The following scenario results in a hang for `lttng destroy`:

lttng create test
lttng enable-event -u -a
lttng enable-rotation --timer 100000
lttng start
lttng stop
lttng start
lttng destroy

Cause
=====

There is an imbalance in how many times we start the rotation timer.

The rotation timer is only removed on `lttng destroy` or when disabling
a time-based rotation. On the other hand, the timer is "started"
on `lttng start` and when enabling a time-based rotation.

The imbalance that emerges from a start/stop/start sequence prevents the
teardown of the session object since a reference to the session is
acquired each time the timer is started.

Solution
========

Do not start the rotation schedule timer if it was already launched.
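
A minimal sketch of such a guard, assuming the timer holds exactly one
session reference while it is running. The session structure, its fields
and both helper functions are hypothetical and only model the reference
accounting; they are not the session daemon's code.

```cpp
// Hypothetical model: only the reference accounting is represented.
struct session {
	// One reference is held by the owner of the session.
	int refcount = 1;
	bool rotation_timer_running = false;
};

// Start the rotation schedule timer unless it is already running.
// Returns true if the timer was started by this call.
bool start_rotation_timer(session& s)
{
	if (s.rotation_timer_running) {
		// Already launched: do not take a second reference.
		return false;
	}

	s.refcount++;
	s.rotation_timer_running = true;
	return true;
}

// Stop the timer and release the reference it held.
// Returns true if a running timer was stopped by this call.
bool stop_rotation_timer(session& s)
{
	if (!s.rotation_timer_running) {
		return false;
	}

	s.rotation_timer_running = false;
	s.refcount--;
	return true;
}
```

In this model, a start/stop/start sequence calls start_rotation_timer()
twice but takes a single reference, so the one removal performed on
destroy brings the count back to its initial value.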

Known drawbacks
===============

None.

Change-Id: Ic5b8938166358fe7629187bebdf02a09e90846c0
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>