---
id: nuts-and-bolts
---

What is LTTng? As its name suggests, the
_Linux Trace Toolkit: next generation_ is a modern toolkit for
tracing Linux systems and applications. So your first question might
rather be: **what is tracing?**

As software engineering progressed and led to what we now take for
granted (complex, numerous, and interdependent software applications
running in parallel on sophisticated operating systems like Linux),
the authors of such components, the software developers, began
feeling a natural urge to have tools that ensure the robustness and
good performance of their masterpieces.

One major achievement in this field is, inarguably, the
<a href="https://www.gnu.org/software/gdb/" class="ext">GNU debugger
(GDB)</a>, which is an essential tool for developers to find and fix
bugs. But even the best debugger won't help make your software run
faster, and nowadays, faster software means either more work done by
the same hardware, or cheaper hardware for the same work.

A _profiler_ is often the tool of choice to identify performance
bottlenecks. Profiling is suitable to identify _where_ performance is
lost in a given piece of software; the profiler outputs a profile, a
statistical summary of observed events, which you can use to find out
which functions took the most time to execute. However, a profiler
won't report _why_ some identified functions are the bottleneck.
Also, bottlenecks might only occur when specific conditions are met.
For a thorough investigation of software performance issues, a history
of execution, with historical values of chosen variables, is
essential. This is where tracing comes in handy.

_Tracing_ is a technique used to understand what goes on in a running
software system. The software used for tracing is called a _tracer_,
which is conceptually similar to a tape recorder. When recording,
specific points placed in the software source code generate events
that are saved on a giant tape: a _trace_ file. Both user applications
and the operating system may be traced at the same time, opening the
possibility of resolving a wide range of problems that are otherwise
extremely challenging.
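
The tape recorder analogy can be sketched in a few lines of Python. This is a toy illustration only, not how LTTng is implemented: the `ToyTracer` class, its `tracepoint()` method, and the `request_begin`/`request_end` event names are all hypothetical. It shows the three ideas from the paragraph above: points placed in the source code, a recorder that can be started and stopped, and a "tape" that keeps a timestamped history along with the values of chosen variables.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Event:
    timestamp_ns: int  # when the tracepoint was hit
    name: str          # which tracepoint was hit
    payload: dict      # values of chosen variables at that moment


@dataclass
class ToyTracer:
    """Hypothetical in-memory tracer; the list stands in for a trace file."""
    recording: bool = False
    tape: list = field(default_factory=list)

    def start(self):
        self.recording = True

    def stop(self):
        self.recording = False

    def tracepoint(self, name, **payload):
        # When nothing is recording, a tracepoint costs only this check.
        if self.recording:
            self.tape.append(Event(time.monotonic_ns(), name, payload))


tracer = ToyTracer()


def handle_request(request_id, size):
    tracer.tracepoint("request_begin", request_id=request_id, size=size)
    checksum = sum(range(size))  # stand-in for real work
    tracer.tracepoint("request_end", request_id=request_id, checksum=checksum)
    return checksum


handle_request(1, 10)    # not recorded: the tracer is stopped
tracer.start()
handle_request(2, 100)
handle_request(3, 1000)
tracer.stop()

for event in tracer.tape:
    print(event.timestamp_ns, event.name, event.payload)
```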

Tracing is often compared to _logging_. However, tracers and loggers
are two different tools, serving two different purposes. Tracers are
designed to record much lower-level events that occur much more
frequently than log messages, often in the thousands per second range,
with very little execution overhead. Logging is more appropriate for
very high-level analysis of less frequent events: user accesses,
exceptional conditions (e.g., errors, warnings), database
transactions, instant messaging communications, etc. More formally,
logging is one of several use cases that can be accomplished with
tracing.

The list of recorded events inside a trace file may be read manually
like a log file for the maximum level of detail, but it is generally
much more interesting to perform application-specific analyses to
produce reduced statistics and graphs that are useful to resolve a
given problem. Trace viewers and analyzers are specialized tools which
achieve this.
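
As a rough idea of what such an analysis does, the following Python sketch reduces a list of raw events to per-request durations and a small statistical summary. The trace contents, the event names, and the `request_durations()` and `summarize()` helpers are made up for illustration; real trace analyzers read actual trace files and are far more capable.

```python
# Hypothetical trace: (timestamp in microseconds, event name, request id).
# These values are invented for the example.
trace = [
    (100, "request_begin", 1),
    (130, "request_begin", 2),
    (180, "request_end", 1),
    (400, "request_end", 2),
    (410, "request_begin", 3),
    (450, "request_end", 3),
]


def request_durations(events):
    """Pair begin/end events by request id and return {id: duration}."""
    begin_times = {}
    durations = {}
    for timestamp, name, request_id in events:
        if name == "request_begin":
            begin_times[request_id] = timestamp
        elif name == "request_end" and request_id in begin_times:
            durations[request_id] = timestamp - begin_times.pop(request_id)
    return durations


def summarize(durations):
    """Reduce per-request durations to a few statistics."""
    values = list(durations.values())
    return {
        "count": len(values),
        "mean": sum(values) / len(values),
        "max": max(values),
        "slowest": max(durations, key=durations.get),
    }


durations = request_durations(trace)
summary = summarize(durations)
print(durations)
print(summary)
```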

So, in the end, this is what LTTng is: a powerful, open source set of
tools to trace the Linux kernel and user applications. LTTng is
composed of several components actively maintained and developed by
its community.

Excluding proprietary solutions, a few competing software tracers
exist for Linux.
<a href="https://www.kernel.org/doc/Documentation/trace/ftrace.txt" class="ext">ftrace</a>
is the de facto function tracer of the Linux kernel.
<a href="http://linux.die.net/man/1/strace" class="ext">strace</a>
is able to record all system calls made by a user process.
<a href="https://sourceware.org/systemtap/" class="ext">SystemTap</a>
is a Linux kernel and user space tracer which uses custom user scripts
to produce plain text traces.
<a href="http://www.sysdig.org/" class="ext">sysdig</a>
also uses scripts, written in Lua, to trace and analyze the Linux
kernel.

The main distinctive features of LTTng are that it produces correlated
kernel and user space traces, and that it does so with the lowest
overhead among these solutions. It produces trace files in the
<a href="http://www.efficios.com/ctf" class="ext"><abbr title="Common Trace Format">CTF</abbr></a>
format, a file format optimized for producing and analyzing
multi-gigabyte data. LTTng is the result of close to 10 years of
active development by a community of passionate developers. It is
currently available on all major desktop and embedded Linux
distributions.

The main interface for tracing control is a single command line tool
named `lttng`. This tool can create several tracing sessions,
enable and disable events on the fly, filter them efficiently with
custom user expressions, start and stop tracing, and do much more.
Traces can be recorded on disk or sent over the network, kept totally
or partially, and viewed once tracing is inactive or in real time.
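
To give a feel for that workflow, here is a small Python sketch that assembles a typical sequence of `lttng` invocations: create a session, enable a kernel event, start, stop, and destroy the session. The session name `demo-session` and the choice of the `sched_switch` kernel event are assumptions made for the example, and the `session_commands()` and `run()` helpers are hypothetical wrappers, not part of LTTng. As written, the script only prints the commands; call `run()` yourself on a machine where LTTng is installed.

```python
import shlex
import subprocess


def session_commands(session_name, kernel_event):
    """Return the lttng invocations for one short tracing session."""
    return [
        ["lttng", "create", session_name],
        ["lttng", "enable-event", "--kernel", kernel_event],
        ["lttng", "start"],
        # ... the workload to observe would run here ...
        ["lttng", "stop"],
        ["lttng", "destroy", session_name],
    ]


def run(commands):
    """Execute the commands in order, stopping at the first failure.

    Requires LTTng to be installed. Kernel tracing typically also
    requires root privileges or membership in the tracing group.
    """
    for command in commands:
        subprocess.run(command, check=True)


# Assumed names, chosen for illustration only.
commands = session_commands("demo-session", "sched_switch")

for command in commands:
    print(shlex.join(command))
```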

[Install LTTng now](#doc-installing-lttng) and start tracing!