LTTng Relay Daemon Architecture
Mathieu Desnoyers, August 2015

This document describes the object model and architecture of the relay
daemon, after the refactoring done within the commit "Fix: Relay daemon
ownership and reference counting".

We have the following object composition hierarchy:

relay connection (main.c, for sessiond/consumer)
  |
  \-> 0 or 1 session
        |
        \-> 0 or many ctf-trace
              |
              \-> 0 or many stream
                    |   |
                    |   \-> 0 or many index
                    |
                    \-------> 0 or 1 viewer stream

live connection (live.c, for client)
  |
  \-> 1 viewer session
        |
        \-> 0 or many session (actually a reference to session as created
              |                by the relay connection)
              |
              \-> ..... (ctf-trace, stream, index, viewer stream)

There are global tables declared in lttng-relayd.h for sessions
(sessions_ht, indexed by session id), streams (relay_streams_ht, indexed
by stream handle), and viewer streams (viewer_streams_ht, indexed by
stream handle). The purpose of those tables is to allow fast lookup of
those objects using the IDs received in the communication protocols.

There is also one connection hash table per worker thread. There is one
worker thread to receive data (main.c), and one worker thread to
interact with viewer clients (live.c). Those tables are indexed by
socket file descriptor.

An RCU lookup+refcounting scheme has been introduced for all objects
(except the viewer session, which is still an exception at the moment).
This scheme allows looking up an object, or traversing an RCU linked
list or hash table, in combination with a getter on the object. The
getter validates that there is still at least one reference to the
object; otherwise the lookup behaves as if the object did not exist.
This scheme is protected by a "reflock" mutex in each object. "reflock"
mutexes can be nested from the innermost object to the outermost object.
In other words, the session reflock can be taken while holding the
ctf-trace reflock.
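
The lookup-plus-getter behaviour described above can be sketched as
follows. This is a minimal illustration, not the relayd code: the names
demo_obj, demo_obj_get, demo_obj_put and demo_lookup are hypothetical,
a plain array stands in for the RCU hash table, and no RCU read-side
lock is taken because the sketch is single-threaded.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object; the real relayd objects carry more state. */
struct demo_obj {
	unsigned long id;		/* key used for lookup */
	int refcount;			/* protected by reflock */
	pthread_mutex_t reflock;
};

/* Try to take a reference. Fails once the refcount has reached 0. */
bool demo_obj_get(struct demo_obj *obj)
{
	bool ok = false;

	pthread_mutex_lock(&obj->reflock);
	if (obj->refcount > 0) {
		obj->refcount++;
		ok = true;
	}
	pthread_mutex_unlock(&obj->reflock);
	return ok;
}

/* Drop a reference. Returns true if this was the last one. */
bool demo_obj_put(struct demo_obj *obj)
{
	bool last;

	pthread_mutex_lock(&obj->reflock);
	assert(obj->refcount > 0);
	last = (--obj->refcount == 0);
	pthread_mutex_unlock(&obj->reflock);
	return last;
}

/*
 * Look up by id in a plain array (assumption: stands in for the RCU
 * hash table). An entry whose refcount is 0 is reported as not found.
 */
struct demo_obj *demo_lookup(struct demo_obj *table, size_t n,
		unsigned long id)
{
	for (size_t i = 0; i < n; i++) {
		if (table[i].id == id) {
			return demo_obj_get(&table[i]) ? &table[i] : NULL;
		}
	}
	return NULL;
}
```

A caller that receives a non-NULL pointer from demo_lookup() owns one
reference and is expected to pair it with demo_obj_put(). Once the last
reference is dropped, later lookups for the same id return NULL even
though the entry is still physically present in the table.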

The relay_connection (connection between the sessiond/consumer and the
relayd) is the outermost object of its hierarchy.

The live connection (connection between a live client and the relayd)
is the outermost object of its hierarchy.

There is also a "lock" mutex in each object. Those are used to
synchronize between threads (currently the main.c relay thread and
live.c client thread) when objects are shared. Locks can be nested from
the outermost object to the innermost object. In other words, the
ctf-trace lock can be taken while holding the session lock.

A "lock" should never be taken while holding a "reflock".
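
The two nesting orders can be illustrated with a short sketch. All
names here (demo_session, demo_trace, demo_attach_trace,
demo_trace_put) are hypothetical and the structures are reduced to the
fields needed to show the ordering. It assumes a trace is attached to
its session before it is put.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct demo_session {
	pthread_mutex_t lock;		/* state lock, outermost */
	pthread_mutex_t reflock;	/* protects refcount */
	int refcount;
	int nr_traces;
};

struct demo_trace {
	pthread_mutex_t lock;		/* state lock, innermost */
	pthread_mutex_t reflock;	/* protects refcount */
	int refcount;
	struct demo_session *session;	/* back reference to container */
};

/*
 * "lock" order: outermost (session) first, then innermost (ctf-trace).
 * Taking a reflock inside a lock is allowed; the reverse is not.
 */
void demo_attach_trace(struct demo_session *s, struct demo_trace *t)
{
	pthread_mutex_lock(&s->lock);
	pthread_mutex_lock(&t->lock);

	pthread_mutex_lock(&s->reflock);
	s->refcount++;			/* trace now references session */
	pthread_mutex_unlock(&s->reflock);

	t->session = s;
	s->nr_traces++;

	pthread_mutex_unlock(&t->lock);
	pthread_mutex_unlock(&s->lock);
}

/*
 * "reflock" order: innermost (ctf-trace) first, then outermost
 * (session). No "lock" is taken while a reflock is held.
 */
void demo_trace_put(struct demo_trace *t)
{
	pthread_mutex_lock(&t->reflock);
	assert(t->refcount > 0);
	if (--t->refcount == 0) {
		struct demo_session *s = t->session;

		pthread_mutex_lock(&s->reflock);
		assert(s->refcount > 0);
		s->refcount--;		/* drop the back reference */
		pthread_mutex_unlock(&s->reflock);
	}
	pthread_mutex_unlock(&t->reflock);
}
```

Keeping the two orders opposite but never interleaving them (no "lock"
inside a "reflock") is what prevents a lock-order inversion between a
thread walking down the hierarchy and a thread releasing references up
the hierarchy.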

RCU linked lists are traversed under RCU read-side protection, and each
list is protected by its own mutex for modifications. Each object found
during iteration should be confirmed using the object "getter" to ensure
its refcount is not 0 (except in cases where the caller actually owns
the object and can therefore assume its refcount is not 0).

RCU hash tables are also traversed under RCU read-side protection. Each
object found during iteration should be confirmed using the object
"getter" to ensure its refcount is not 0 (except, again, if the caller
has ownership and can assume the object refcount is not 0).

An object is created with a refcount of 1. Each successful getter call
increments the refcount, and needs to be paired with a "put" to
decrement it. A final put on "self" (ownership) allows the refcount to
reach 0, which triggers release of the object; its memory is then freed
through call_rcu.

In the composition scheme, each composite has a back reference to its
container. Therefore, each composite holds a reference (refcount) on its
container. This allows following pointers from e.g. viewer stream to
stream to ctf-trace to session without performing any validation,
because those back references transitively keep every container alive.
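
The effect of transitive back references can be sketched as follows.
This is a single-threaded illustration under stated assumptions:
demo_node, demo_create, demo_put and demo_released are hypothetical
names, reflocks are omitted, and free() is called directly where the
daemon would defer reclamation through call_rcu.

```c
#include <assert.h>
#include <stdlib.h>

/* Generic refcounted node; "container" is the back reference. */
struct demo_node {
	const char *name;
	int refcount;
	struct demo_node *container;	/* NULL for the outermost object */
};

int demo_released;	/* counts releases, for illustration only */

/*
 * Create a node with refcount 1 (the ownership reference) and take a
 * reference on its container on behalf of the new node.
 */
struct demo_node *demo_create(const char *name, struct demo_node *container)
{
	struct demo_node *n = calloc(1, sizeof(*n));

	if (!n) {
		return NULL;
	}
	n->name = name;
	n->refcount = 1;
	n->container = container;
	if (container) {
		container->refcount++;
	}
	return n;
}

/*
 * Drop one reference. When the last reference goes away, release the
 * node and drop its back reference, which may cascade upward.
 */
void demo_put(struct demo_node *n)
{
	while (n) {
		struct demo_node *container;

		assert(n->refcount > 0);
		if (--n->refcount > 0) {
			return;
		}
		container = n->container;
		demo_released++;
		free(n);	/* stand-in for deferred free via call_rcu */
		n = container;
	}
}
```

For example, after creating a session, a ctf-trace contained in it and
a stream contained in the ctf-trace, putting the ownership references
on the session and the ctf-trace releases nothing, because the stream
still holds the chain alive. Putting the stream then releases all three
objects, innermost first.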

In addition to those back references, there are a few key ownership
references held. The connection in the relay worker thread (main.c)
holds ownership on the session, and on each stream it contains. The
connection in the live worker thread (live.c) holds ownership on each
viewer stream it creates. The rest is ensured by back references from
composite to container objects. When a connection is closed, it puts all
the ownership references it is holding. This will then eventually
trigger destruction of the session, streams, and viewer streams
associated with the connection, once the remaining back references have
been put and their refcounts reach 0.

RCU read-side locks are now only held during iteration on RCU lists and
hash tables, and within the internals of the get (lookup) and put
functions. Those functions then use refcounting to ensure existence of
the object when returned to their caller.