Fix: relayd: leaked socket for live connections
author Kienan Stewart <kstewart@efficios.com>
Thu, 17 Oct 2024 14:40:48 +0000 (10:40 -0400)
committer Jérémie Galarneau <jeremie.galarneau@efficios.com>
Thu, 24 Oct 2024 18:20:27 +0000 (14:20 -0400)
Observed issue
==============

While investigating an issue where Python programs using the `bt2` module
would crash on shutdown, it was noticed that the same scenario also
exposed a leaked file descriptor in the relay daemon.

The following Python program may be used to demonstrate the leak.

```
import os
import socket
import time

import bt2

def is_connected():
    ctf_live_cc = bt2.find_plugin("ctf").source_component_classes["lttng-live"]
    q = bt2.QueryExecutor(ctf_live_cc, "sessions", params={"url": "net://localhost"})
    connected = False
    try:
        for x in q.query():
            print(x)
            if x['session-name'] == 'test' and x['client-count'] >= 1:
                connected = True
                break
    except Exception as e:
        print(e)
    return connected

os.system("lttng create test --live")
os.system("lttng enable-event -u --all")
os.system("lttng start")

ctf_live_cc = bt2.find_plugin("ctf").source_component_classes["lttng-live"]
url = "net://localhost/host/{}/test".format(socket.gethostname())
iterator = bt2.TraceCollectionMessageIterator(
    bt2.ComponentSpec(
        ctf_live_cc,
        {"inputs": [url], "session-not-found-action": "end"},
    )
)

data = []
# Pull messages until the relay daemon reports a client attached to 'test'.
while not is_connected():
    try:
        data.append(next(iterator))
    except Exception as e:
        print(e)
    time.sleep(0.1)

# Tear everything down while the live iterator (and its viewer connection)
# still exists.
os.system("lttng stop")
os.system("lttng destroy")
os.system("killall lttng-sessiond")
os.system("killall lttng-relayd")
```

Cause
=====

During the clean-up at the end of the live worker thread, all remaining
viewer connections have their references put (which should cause the
connection to be released).

Releasing a connection never calls `close` on its socket's file
descriptor, nor does it update `the_fd_tracker`, so the descriptor stays
open and remains tracked.

Solution
========

Explicitly close and stop tracking the sockets for viewer connections
just prior to the final put.
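
The difference between the leaking path and the fixed path can be sketched
with a toy model. This is only an illustration, not the relay daemon's
code: `ToyFdTracker`, `ToyConnection`, `fd_is_open` and `demo` are
hypothetical names, and a `socket.socketpair()` stands in for a viewer
socket.

```python
import os
import socket


class ToyFdTracker:
    """Hypothetical stand-in for an fd tracker: remembers tracked fds."""

    def __init__(self):
        self.tracked = set()

    def track(self, fd):
        self.tracked.add(fd)

    def close_fd(self, fd, close_cb):
        # Close through the callback, then stop tracking the descriptor.
        close_cb()
        self.tracked.discard(fd)


class ToyConnection:
    """Hypothetical reference-counted connection owning a socket."""

    def __init__(self, sock, tracker):
        self.sock = sock
        self.refs = 1
        self.released = False
        tracker.track(sock.fileno())

    def put(self):
        # Dropping the last reference releases the object, but (as in the
        # bug) neither closes the socket nor updates the tracker.
        self.refs -= 1
        if self.refs == 0:
            self.released = True


def fd_is_open(fd):
    """Return True if `fd` still refers to an open descriptor."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


def demo(close_before_put):
    """Return (fd still open, fd still tracked) after the final put."""
    tracker = ToyFdTracker()
    left, right = socket.socketpair()
    fd = left.fileno()
    conn = ToyConnection(left, tracker)
    if close_before_put:
        # Fixed path: close and untrack just before the final put.
        tracker.close_fd(fd, conn.sock.close)
    conn.put()
    result = (fd_is_open(fd), fd in tracker.tracked)
    # Clean up so the sketch itself leaks nothing.
    left.close()
    right.close()
    return result


if __name__ == "__main__":
    print("without close:", demo(close_before_put=False))
    print("with close:   ", demo(close_before_put=True))
```

In this model, skipping the close leaves the descriptor both open and
tracked after the last put, while closing first leaves neither.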

Known drawbacks
===============

None.

Change-Id: I5eb0a3e5b9cb14dc1199be463bc312cbc72d8244
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
src/bin/lttng-relayd/live.cpp

index 6426330f5a97298766cff5594e634d815ca71c58..3857fbced8693d44b02d350df6e586d29700dab6 100644
@@ -2880,6 +2880,8 @@ error:
                                                 &relay_connection::sock_n>(
                     *viewer_connections_ht->ht)) {
                health_code_update();
+               fd_tracker_close_unsuspendable_fd(
+                       the_fd_tracker, &destroy_conn->sock->fd, 1, close_sock, destroy_conn->sock);
                connection_put(destroy_conn);
        }
 