Fix: sessiond: uninitialized ust_app event_notifier_group
Observed issue
==============
On SLES12SP5 (gcc 4.8.5), occasional crashes were observed with the
following log entry:
```
DBG1 - 18:49:41.
084261033 [Notification]: Poll wait returned (1) (in thread_notification() at notification-thread.cpp:632)
DBG1 - 18:49:41.
084264635 [Notification]: Handling fd (28) activity (1) (in thread_notification() at notification-thread.cpp:656)
DBG1 - 18:49:41.
084269451 [Notification]: Received `REMOVE_TRACER_EVENT_SOURCE` command (in handle_notification_thread_command() at notification-thread-events.cpp:3178)
lttng-sessiond: notification-thread-events.cpp:2228: int
handle_notification_thread_command_remove_tracer_event_source(notification_thread_state*,
int, lttng_error_code*): Assertion `source_element' failed
```
The easiest reproduce was to comment out most all tests in
`regression/ust/ust-app-ctl-paths/test_ust_app_ctl_paths` except
`test_trace_self_default_paths`. Then run a loop until a failure is
detected, e.g.
```
make -j4 && \
ITER=0 && \
while true ; do LTTNG_UST_DEBUG=1 LTTNG_TEST_LOG_DIR=- LTTNG_TEST_VERBOSE_SESSIOND=1 ./tests/regression/ust/ust-app-ctl-paths/test_ust_app_ctl_paths 2>&1 | tee sessiond.log ; if grep -qE '^not ok' sessiond.log ; then break; fi ; ITER=$((ITER+1)) ; echo "Done $ITER iterations" ; sleep 1 ; done
```
Cause
=====
After investigation, it appears to be that `lta = new ust_app;` may
occasionally return the address multiple times when previous
allocation is rapidly de-allocated.
For example, in the following snippet of a log tracking an application
address that shows re-use:
```
DBG1 - 19:47:29.
103043500 [UST registration dispatch]: Create new ust app allocation 0x7fce14001da0 for pid 23256 sock 53 (in ust_app_create() at ust-app.cpp:4061)
DBG1 - 19:47:29.
115347169 [UST registration dispatch]: wait_node_in_queue looked up app 0x7fce14001da0 pid 23256 (in thread_dispatch_ust_registration() at dispatch.cpp:362)
DBG1 - 19:47:29.
115933991 [UST registration dispatch]: ust app version app 0x7fce14001da0 pid 23256 (in thread_dispatch_ust_registration() at dispatch.cpp:392)
DBG1 - 19:47:29.
115954872 [UST registration dispatch]: Added ust app 0x7fce14001da0 pid 23256 (in thread_dispatch_ust_registration() at dispatch.cpp:403)
DBG1 - 19:47:29.
115958125 [UST registration dispatch]: ust_app_setup_event_notifier_group app 0x7fce14001da0 pid 23256 (in ust_app_setup_event_notifier_group() at ust-app.cpp:4226)
DBG1 - 19:47:29.
116367064 [UST registration dispatch]: app 0x7fce14001da0 add tracer event source in setup_event_notifier_group fd 60 ret 10 (in ust_app_setup_event_notifier_group() at ust-app.cpp:4279)
DBG1 - 19:47:29.
117653161 [UST registration dispatch]: app 0x7fce14001da0 pid 23256 return from setup trace event notifier group ret 0, event notifier group 0x7fce14003180 (in ust_app_setup_event_notifier_group() at ust-app.cpp:4313)
DBG1 - 19:47:29.
152058781 [23308/23332]: delete_ust_app 0x7fce14001da0 pid 23256 sock 53 (in delete_ust_app() at ust-app.cpp:1021)
DBG1 - 19:47:29.
152076408 [23308/23332]: No event notifier group object remove tracer event source fd 60 app 0x7fce14001da0 (in delete_ust_app() at ust-app.cpp:1075)
DBG1 - 19:47:29.
191427936 [UST registration dispatch]: Create new ust app allocation 0x7fce14001da0 for pid 23371 sock 71 (in ust_app_create() at ust-app.cpp:4061)
DBG1 - 19:47:29.
204450310 [UST registration dispatch]: wait node 0x7fce14001d10, app 0x7fce14001da0 (in sanitize_wait_queue() at dispatch.cpp:144)
DBG1 - 19:47:29.
204454373 [UST registration dispatch]: Culling app from wait queue: pid=23371 app 0x7fce14001da0 (in sanitize_wait_queue() at dispatch.cpp:146)
DBG1 - 19:47:29.
259291727 [23308/23332]: delete_ust_app 0x7fce14001da0 pid 23371 sock 71 (in delete_ust_app() at ust-app.cpp:1021)
DBG1 - 19:47:29.
590230014 [23308/23332]: No event notifier group object remove tracer event source fd 81 app 0x7fce14001da0 (in delete_ust_app() at ust-app.cpp:1075)
lttng-sessiond: notification-thread-events.cpp:2233: int handle_notification_thread_command_remove_tracer_event_source(notification_thread_state*, int, lttng_error_code*):
Assertion `source_element' failed.
```
As this variable is default initialized[1], the memory contents are
indeterminate. In `ust_app_create`, many fields are manually
initialized; however `lta->event_notifier_group.object` is not set
explicitly.
The next time `event_notifier_group.object` is set is when the the
`notification_thread_command_add_tracer_event_source` succeeds. If the
app dies before than, or if that registration fails, the
`ust_app.event_notifier_group.object` may be non-NULL. When this
object passes to `delete_ust_app`, the only check made before trying
to remove the tracer event
source (`notification_thread_command_remove_tracer_event_source`) is
if `event_notifier_group.object` is NULL or not.
If the file descriptor hasn't been previously registered, the assert
is triggered.
While I haven't confirmed my hypothesis, it is possible that the
behaviour of GCC regarding default initialization has changed between
GCC 4.8.5 and now.
Solution
========
Use an explicit initialization for the new allocation.
Known drawbacks
===============
None.
References
==========
[1]: https://en.cppreference.com/w/cpp/language/default_initialization (See case 2)
Change-Id: I12de1864d2995be014d5526283fb843d0f99093e
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>