Commit | Line | Data |
---|---|---|
c719e480 MD |
1 | LTTng modules design |
2 | --------------------- | |
3 | ||
4 | by Mathieu Desnoyers | |
5 | June 30, 2020 | |
6 | ||
7 | This document covers the high-level design of lttng-modules. | |
8 | ||
9 | LTTng modules is a kernel tracer for the Linux kernel. It can either be | |
10 | loaded as a set of kernel modules or built into a Linux kernel. | |
11 | ||
12 | Here are its key components: | |
13 | ||
14 | * LTTng modules ABI | |
15 | ||
16 | Files: | |
17 | - src/lttng-abi.c | |
18 | - include/lttng/abi.h | |
19 | ||
20 | This ABI consists of ioctls with code 0xF6. It extensively uses | |
21 | anonymous file descriptors to represent the tracer "objects". Only | |
22 | root is allowed to interact with those ioctls. | |
23 | ||
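As a rough illustration of what "ioctls with code 0xF6" means, the sketch below uses the standard Linux _IO(), _IOC_TYPE() and _IOC_NR() macros to build and inspect a command number carrying that code. The helper names and the command number 0x42 are hypothetical and are not taken from include/lttng/abi.h.

```c
#include <assert.h>
#include <linux/ioctl.h>	/* _IO(), _IOC_TYPE(), _IOC_NR(), _IOC_NRBITS */

/* ioctl code stated in this document. */
#define EXAMPLE_LTTNG_IOCTL_CODE	0xF6

/* Build a command number carrying the 0xF6 code. "nr" values are made up. */
static unsigned int example_make_cmd(unsigned int nr)
{
	assert(nr < (1U << _IOC_NRBITS));	/* nr must fit its bit field */
	return _IO(EXAMPLE_LTTNG_IOCTL_CODE, nr);
}

/* Return 1 if cmd carries the 0xF6 code, 0 otherwise. */
static int example_cmd_has_lttng_code(unsigned int cmd)
{
	return _IOC_TYPE(cmd) == EXAMPLE_LTTNG_IOCTL_CODE;
}

/* Extract the per-command number from cmd. */
static unsigned int example_cmd_nr(unsigned int cmd)
{
	return _IOC_NR(cmd);
}
```

A dispatcher could use a check like example_cmd_has_lttng_code() to reject commands that belong to another subsystem before looking at the command number.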
24 | ||
25 | * LTTng session, channels, contexts and events management | |
26 | - src/lttng-events.c | |
27 | - include/lttng/lttng-events.h | |
28 | ||
29 | Tracks the current state of configured tracing sessions, channels, | |
30 | contexts and events. The session, channel, context and event state is | |
31 | manipulated through the LTTng modules ABI. A session contains 0 or | |
32 | more channels, through which data is traced. A channel is associated | |
33 | with an instance of a lib ring buffer client. Channels have 0 or more | |
34 | events, which are associated with kernel instrumentation as event | |
35 | sources. | |
36 | ||
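The containment described above (a session holds 0 or more channels, a channel holds 0 or more events) can be pictured with a minimal sketch. All structure and function names here are hypothetical; the real definitions live in the files listed above and carry much more state.

```c
#include <assert.h>
#include <stddef.h>

struct example_channel;

struct example_event {
	const char *name;			/* instrumentation source name */
	struct example_channel *chan;		/* owning channel */
	struct example_event *next;
};

struct example_session;

struct example_channel {
	struct example_session *session;	/* owning session */
	struct example_event *events;		/* 0 or more events */
	struct example_channel *next;
};

struct example_session {
	struct example_channel *channels;	/* 0 or more channels */
};

static void example_session_add_channel(struct example_session *session,
					struct example_channel *chan)
{
	assert(!chan->session);	/* a channel belongs to one session */
	chan->session = session;
	chan->next = session->channels;
	session->channels = chan;
}

static void example_channel_add_event(struct example_channel *chan,
				      struct example_event *event)
{
	assert(!event->chan);	/* an event belongs to one channel */
	event->chan = chan;
	event->next = chan->events;
	chan->events = event;
}

/* Walk the session -> channel -> event hierarchy. */
static size_t example_session_count_events(const struct example_session *session)
{
	const struct example_channel *chan;
	const struct example_event *event;
	size_t count = 0;

	for (chan = session->channels; chan; chan = chan->next)
		for (event = chan->events; event; event = event->next)
			count++;
	return count;
}
```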
37 | ||
38 | * lib ring buffer | |
39 | ||
40 | Generic ring buffer library (kernel implementation). Note, there is | |
41 | a very similar copy of this implementation within the lttng-ust | |
42 | user-space tracer. The overall goal of this library is to support | |
43 | both kernel and user-space tracing. | |
44 | ||
45 | Files: | |
46 | - src/lib/ringbuffer/* | |
47 | - include/ringbuffer/* | |
48 | ||
49 | These include the ring buffer ABI meant for consuming the buffer data | |
50 | from user-space. It is implemented in: | |
51 | ||
52 | - src/lib/ringbuffer/ring_buffer_vfs.c (open, release, poll, ioctl) | |
53 | - src/lib/ringbuffer/ring_buffer_mmap.c (mmap) | |
54 | - src/lib/ringbuffer/ring_buffer_splice.c (splice) | |
55 | - include/ringbuffer/vfs.h: lib ring buffer ioctl commands (code 0xF6). | |
56 | ||
57 | The ring buffer library can be configured for various use-cases by | |
58 | creating a specialized ring buffer "client" (template). | |
238c45de MD |
59 | include/ringbuffer/config.h details the various configuration |
60 | parameters which are supported. | |
c719e480 MD |
61 | |
62 | ||
63 | * LTTng modules ring buffer clients | |
64 | ||
65 | Files: | |
66 | - src/lttng-ring-buffer-client-discard.c | |
67 | - src/lttng-ring-buffer-client-mmap-discard.c | |
68 | - src/lttng-ring-buffer-client-mmap-overwrite.c | |
69 | - src/lttng-ring-buffer-client-overwrite.c | |
70 | - src/lttng-ring-buffer-metadata-client.c | |
71 | - src/lttng-ring-buffer-metadata-mmap-client.c | |
72 | - src/lttng-ring-buffer-client.h | |
73 | - src/lttng-ring-buffer-metadata-client.h | |
74 | ||
75 | These are the users of lib ring buffer, with a specialized instance of | |
76 | the ring buffer for each use-case supported by LTTng. They are | |
77 | hand-crafted templates in C. The fast paths are inlined within each | |
78 | client, and the slow paths are kept in the common library to minimize | |
79 | code memory usage. | |
80 | ||
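The "hand-crafted template" idea can be sketched as follows: a generic inline fast path takes a constant configuration, and each client wraps it with its own configuration so the compiler can specialize the code. This is a toy model under stated assumptions: all names are hypothetical, records are single integers rather than sub-buffers, and "discard" and "overwrite" are assumed to describe what happens to records when the buffer is full.

```c
#include <assert.h>
#include <stddef.h>

enum example_rb_mode { EXAMPLE_RB_DISCARD, EXAMPLE_RB_OVERWRITE };

struct example_rb_config {
	enum example_rb_mode mode;
};

#define EXAMPLE_RB_SLOTS	4

struct example_rb {
	int slot[EXAMPLE_RB_SLOTS];
	size_t head;	/* total records written */
	size_t tail;	/* total records consumed or overwritten */
	size_t lost;	/* records dropped in discard mode */
};

/*
 * Generic fast path. Each client passes a constant config, so once this
 * is inlined the compiler can drop the branch the client never takes.
 */
static inline int example_rb_write(const struct example_rb_config *config,
				   struct example_rb *rb, int value)
{
	if (rb->head - rb->tail == EXAMPLE_RB_SLOTS) {	/* buffer full */
		if (config->mode == EXAMPLE_RB_DISCARD) {
			rb->lost++;
			return -1;
		}
		rb->tail++;	/* overwrite: give up the oldest record */
	}
	rb->slot[rb->head % EXAMPLE_RB_SLOTS] = value;
	rb->head++;
	return 0;
}

static const struct example_rb_config example_discard_config = {
	.mode = EXAMPLE_RB_DISCARD,
};

static const struct example_rb_config example_overwrite_config = {
	.mode = EXAMPLE_RB_OVERWRITE,
};

/* "Discard" client: specialized instance of the generic fast path. */
static int example_client_discard_write(struct example_rb *rb, int value)
{
	return example_rb_write(&example_discard_config, rb, value);
}

/* "Overwrite" client: same fast path, different constant config. */
static int example_client_overwrite_write(struct example_rb *rb, int value)
{
	return example_rb_write(&example_overwrite_config, rb, value);
}

/* Oldest record still held in the buffer. */
static int example_rb_oldest(const struct example_rb *rb)
{
	assert(rb->head != rb->tail);	/* buffer must not be empty */
	return rb->slot[rb->tail % EXAMPLE_RB_SLOTS];
}
```

Writing five records into a four-slot buffer drops the fifth record in the discard client, while the overwrite client keeps the fifth and loses the first.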
81 | ||
82 | * LTTng filter | |
83 | ||
84 | The filter in lttng-modules is meant to quickly discard events which | |
85 | do not match an expression. The expression parsing is all done in | |
86 | user-space within lttng-tools. The filter is received by lttng-modules | |
87 | as bytecode. The filter is optimized for the frequent case, in which | |
88 | most events are discarded. The filter operates on input arguments | |
89 | received on the stack, before the ring buffer is touched. | |
90 | ||
91 | Files: | |
92 | - include/lttng/filter-bytecode.h: LTTng filter bytecode. | |
93 | - src/lttng-filter-validator.c: Validation pass on bytecode reception. | |
94 | - src/lttng-filter.c: Filter linker code: link a bytecode onto a given | |
95 | event (knowing its field offsets). | |
96 | - src/lttng-filter-specialize.c: Specialize the bytecode, transforming | |
97 | generic instructions into | |
98 | type-specific (faster) instructions. | |
99 | - src/lttng-filter-interpreter.c: Bytecode interpreter, called by | |
100 | instrumentation to filter events. | |
101 | ||
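To make the split between a one-time validation pass and a per-event interpreter concrete, here is a minimal sketch of a stack-based filter. The instruction set is made up for this example and is not the one defined in include/lttng/filter-bytecode.h; the linking and specialization steps are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up instruction set, NOT the LTTng filter bytecode. */
enum example_op {
	EXAMPLE_OP_LOAD_FIELD,	/* push fields[operand] */
	EXAMPLE_OP_LOAD_CONST,	/* push operand */
	EXAMPLE_OP_EQ,		/* pop b, pop a, push a == b */
	EXAMPLE_OP_GT,		/* pop b, pop a, push a > b */
	EXAMPLE_OP_AND,		/* pop b, pop a, push a && b */
	EXAMPLE_OP_RETURN,	/* end: non-zero top of stack keeps the event */
};

struct example_insn {
	enum example_op op;
	int64_t operand;	/* used by the two LOAD instructions only */
};

#define EXAMPLE_STACK_DEPTH	8

/* Validation pass: run once, when the bytecode is received. */
static int example_filter_validate(const struct example_insn *code,
				   size_t len, size_t nr_fields)
{
	size_t i, depth = 0;

	for (i = 0; i < len; i++) {
		switch (code[i].op) {
		case EXAMPLE_OP_LOAD_FIELD:
			if (code[i].operand < 0 ||
			    (uint64_t) code[i].operand >= nr_fields)
				return -1;	/* no such field */
			/* fall through */
		case EXAMPLE_OP_LOAD_CONST:
			if (depth == EXAMPLE_STACK_DEPTH)
				return -1;	/* stack overflow */
			depth++;
			break;
		case EXAMPLE_OP_EQ:
		case EXAMPLE_OP_GT:
		case EXAMPLE_OP_AND:
			if (depth < 2)
				return -1;	/* stack underflow */
			depth--;
			break;
		case EXAMPLE_OP_RETURN:
			/* Must be last, with exactly one result on the stack. */
			return (i == len - 1 && depth == 1) ? 0 : -1;
		default:
			return -1;	/* unknown instruction */
		}
	}
	return -1;	/* missing RETURN */
}

/* Interpreter: run for each event, on bytecode that passed validation. */
static int example_filter_interpret(const struct example_insn *code,
				    const int64_t *fields)
{
	int64_t stack[EXAMPLE_STACK_DEPTH];
	size_t top = 0;	/* number of values on the stack */

	for (;; code++) {
		switch (code->op) {
		case EXAMPLE_OP_LOAD_FIELD:
			stack[top++] = fields[code->operand];
			break;
		case EXAMPLE_OP_LOAD_CONST:
			stack[top++] = code->operand;
			break;
		case EXAMPLE_OP_EQ:
			top--;
			stack[top - 1] = stack[top - 1] == stack[top];
			break;
		case EXAMPLE_OP_GT:
			top--;
			stack[top - 1] = stack[top - 1] > stack[top];
			break;
		case EXAMPLE_OP_AND:
			top--;
			stack[top - 1] = stack[top - 1] && stack[top];
			break;
		case EXAMPLE_OP_RETURN:
			assert(top == 1);	/* guaranteed by the validator */
			return stack[0] != 0;
		default:
			return 0;	/* unreachable after validation */
		}
	}
}
```

Because the validator rejects out-of-range fields and unbalanced stacks up front, the interpreter loop carries no bounds checks, which keeps the per-event cost low in the common case where the event is discarded.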
102 | * LTTng contexts | |
103 | ||
104 | LTTng-modules supports the notion of "contexts" which can be attached either | |
105 | to specific events or to all events in a channel. Those are additional | |
106 | data which can be saved prior to the event payload, e.g. current | |
107 | thread ID, process name, performance counters, and more. | |
108 | ||
109 | Files: | |
110 | - src/lttng-context.c: Context state associated to a channel or event, | |
111 | and helpers. | |
112 | - src/lttng-context-*.c: Implementation of all supported contexts: | |
113 | callstack, cgroup-ns, cpu-id, egid, euid, gid, hostname, | |
114 | interruptible, ipc-ns, migratable, mnt-ns, need-reschedule, net-ns, | |
115 | nice, perf-counters, pid, pid-ns, ppid, preemptible, prio, procname, | |
116 | sgid, suid, tid, uid, user-ns, uts-ns, vegid, veuid, vgid, vpid, vppid, | |
117 | vsgid, vtid, vuid. | |
118 | ||
119 | ||
120 | * LTTng tracepoint instrumentation | |
121 | ||
122 | The LTTng tracer attaches "probes" to kernel subsystems. A probe is a | |
123 | set of tracepoint callbacks matching the tracepoint instrumentation | |
124 | for a kernel subsystem. Each probe can be loaded separately. | |
125 | ||
126 | Due to limitations in the kernel TRACE_EVENT macros, LTTng | |
127 | implements its own LTTNG_TRACEPOINT_EVENT macros. It uses the | |
128 | upstream kernel TRACE_EVENT macros only to validate the prototype | |
129 | of its callbacks. Also, the event field semantics LTTng exposes in | |
130 | its traces match what is exposed to user-space through /proc, which | |
131 | requires a different field layout implementation than what the | |
132 | upstream kernel exposes to user-space. | |
133 | ||
134 | Files: | |
135 | - src/lttng-tracepoint.c: Mapping between tracepoint instrumentation and LTTng | |
136 | events. | |
137 | - src/lttng-probes.c: LTTng probes registry. | |
138 | - include/instrumentation/events/*: LTTng tracepoint instrumentation | |
139 | headers for all kernel subsystems. | |
140 | ||
141 | ||
142 | * LTTng system call instrumentation | |
143 | ||
144 | The LTTng tracer gathers both input and output arguments from each | |
145 | system call, for all supported architectures. This means the system | |
146 | call probe callbacks read from user-space memory when needed. | |
147 | ||
148 | Files: | |
149 | - src/lttng-syscalls.c: LTTng system call instrumentation callbacks and | |
150 | tables. | |
151 | - include/instrumentation/syscall/*: generated and override system | |
152 | call instrumentation headers. | |
153 | ||
154 | ||
155 | * LTTng statedump | |
156 | ||
157 | Dump kernel state at trace start or when an explicit "statedump" is | |
158 | requested. Useful to reconstruct the entire kernel state during | |
159 | post-processing. Dumps: thread scheduling state, file | |
160 | descriptor tables, interrupt handlers, network interfaces, block | |
161 | devices, CPU topology. Also performs a "fence" on all CPUs so that | |
162 | they reach a quiescent state before the start and end of statedump. | |
163 | ||
164 | Files: | |
165 | - src/lttng-statedump-impl.c | |
166 | ||
167 | ||
168 | * LTTng tracker | |
169 | ||
170 | User ID, group ID and process ID trackers, for filtering entire | |
171 | sessions based on UID, GID, and PID. | |
172 | ||
173 | Files: | |
174 | - src/lttng-tracker-id.c | |
175 | ||
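The tracker behaviour can be sketched as a per-session set of IDs that is consulted before an event is recorded. The sketch assumes that a tracker accepts every ID until a first ID is added, and that an event is recorded only when the PID, UID and GID trackers all accept it. All names are hypothetical and the fixed-size array stands in for whatever lookup structure src/lttng-tracker-id.c actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_TRACKER_MAX_IDS	16

struct example_id_tracker {
	bool track_all;		/* assumed default: no filtering */
	int ids[EXAMPLE_TRACKER_MAX_IDS];
	size_t nr_ids;
};

static void example_tracker_init(struct example_id_tracker *tracker)
{
	tracker->track_all = true;
	tracker->nr_ids = 0;
}

static bool example_tracker_lookup(const struct example_id_tracker *tracker,
				   int id)
{
	size_t i;

	for (i = 0; i < tracker->nr_ids; i++)
		if (tracker->ids[i] == id)
			return true;
	return false;
}

/* Track one ID. The first successful add turns filtering on. */
static int example_tracker_add(struct example_id_tracker *tracker, int id)
{
	assert(id >= 0);
	if (!example_tracker_lookup(tracker, id)) {
		if (tracker->nr_ids == EXAMPLE_TRACKER_MAX_IDS)
			return -1;	/* table full */
		tracker->ids[tracker->nr_ids++] = id;
	}
	tracker->track_all = false;
	return 0;
}

static bool example_tracker_accepts(const struct example_id_tracker *tracker,
				    int id)
{
	return tracker->track_all || example_tracker_lookup(tracker, id);
}

struct example_session_trackers {
	struct example_id_tracker pid;
	struct example_id_tracker uid;
	struct example_id_tracker gid;
};

/* An event is kept only if every tracker of the session accepts it. */
static bool example_session_accepts(const struct example_session_trackers *t,
				    int pid, int uid, int gid)
{
	return example_tracker_accepts(&t->pid, pid) &&
	       example_tracker_accepts(&t->uid, uid) &&
	       example_tracker_accepts(&t->gid, gid);
}
```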
176 | ||
177 | * LTTng clock | |
178 | ||
179 | Clock plugin registration. The clock used by the LTTng modules kernel | |
180 | tracer can be overridden by a plugin module. | |
181 | ||
182 | Files: | |
183 | - src/lttng-clock.c | |
184 | - include/lttng/clock.h |
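
The override mechanism can be sketched as a table of function pointers that a plugin replaces at registration time. This is a user-space model only: the structure, the function names, the fixed clock values and the "one plugin at a time" rule are assumptions made for the example, not the interface declared in include/lttng/clock.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct example_trace_clock {
	const char *name;
	uint64_t (*read64)(void);	/* current clock value */
	uint64_t (*freq)(void);		/* clock frequency, in Hz */
};

/* Stand-in default clock: returns a fixed value to keep the sketch simple. */
static uint64_t example_default_read64(void) { return 1000; }
static uint64_t example_default_freq(void) { return 1000000000ULL; }

static const struct example_trace_clock example_default_clock = {
	.name = "default",
	.read64 = example_default_read64,
	.freq = example_default_freq,
};

/* Stand-in plugin clock, as a plugin module might provide. */
static uint64_t example_plugin_read64(void) { return 42; }
static uint64_t example_plugin_freq(void) { return 1000; }

static const struct example_trace_clock example_plugin_clock = {
	.name = "plugin",
	.read64 = example_plugin_read64,
	.freq = example_plugin_freq,
};

/* Clock currently used by the tracer. */
static const struct example_trace_clock *example_current_clock =
	&example_default_clock;

/* Install a plugin clock. Assumed rule: only one override at a time. */
static int example_clock_register_plugin(const struct example_trace_clock *clock)
{
	assert(clock && clock->read64 && clock->freq);
	if (example_current_clock != &example_default_clock)
		return -1;
	example_current_clock = clock;
	return 0;
}

/* Remove a plugin clock and fall back to the default clock. */
static void example_clock_unregister_plugin(const struct example_trace_clock *clock)
{
	if (example_current_clock == clock)
		example_current_clock = &example_default_clock;
}

/* What the tracer would call to timestamp an event. */
static uint64_t example_trace_clock_read64(void)
{
	return example_current_clock->read64();
}
```

The tracer side only ever calls example_trace_clock_read64(), so swapping the clock does not require touching the instrumentation code.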