---
id: nuts-and-bolts
---

What is LTTng? As its name suggests, the
_Linux Trace Toolkit: next generation_ is a modern toolkit for
tracing Linux systems and applications. So your first question might
instead be: **what is tracing?**

As software engineering progressed and led to what we now take for
granted (numerous complex, interdependent software applications
running in parallel on sophisticated operating systems like Linux),
the authors of such components, software developers, began to feel
a natural need for tools to ensure the robustness and good
performance of their masterpieces.

One major achievement in this field is, inarguably, the
<a href="https://www.gnu.org/software/gdb/" class="ext">GNU debugger
(GDB)</a>, which is an essential tool for developers to find and fix
bugs. But even the best debugger won't help make your software run
faster, and nowadays, faster software means either more work done by
the same hardware, or cheaper hardware for the same work.

A _profiler_ is often the tool of choice to identify performance
bottlenecks. Profiling is suitable for identifying _where_ performance
is lost in a given piece of software; the profiler outputs a profile,
a statistical summary of observed events, which you can use to find
out which functions took the most time to execute. However, a profiler
won't report _why_ some identified functions are the bottleneck.
Also, bottlenecks might only occur when specific conditions are met.
For a thorough investigation of software performance issues, a history
of execution, with historical values of chosen variables, is
essential. This is where tracing comes in handy.

_Tracing_ is a technique used to understand what goes on in a running
software system. The software used for tracing is called a _tracer_,
which is conceptually similar to a tape recorder. When recording,
specific points placed in the software source code generate events
that are saved on a giant tape: a _trace_ file. Both user applications
and the operating system may be traced at the same time, which makes
it possible to resolve a wide range of problems that are otherwise
extremely challenging.

Tracing is often compared to _logging_. However, tracers and loggers
are two different types of tools, serving two different purposes.
Tracers are designed to record much lower-level events that occur much
more frequently than log messages, often thousands per second, with
very little execution overhead. Logging is more appropriate for
very high-level analysis of less frequent events: user accesses,
exceptional conditions (errors and warnings, for example), database
transactions, instant messaging communications, etc. More formally,
logging is one of several use cases that can be accomplished with
tracing.

The list of recorded events inside a trace file may be read manually
like a log file for the maximum level of detail, but it is generally
much more useful to perform application-specific analyses that
produce reduced statistics and graphs to help resolve a given
problem. Trace viewers and analyzers are specialized tools which
achieve this.

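The tape recorder analogy can be made concrete with a toy sketch. It is only an illustration of the idea (instrumentation points, a recorded trace, and a later analysis), not how LTTng is implemented or used: `ToyTracer`, `tracepoint()` and `handle_request()` are hypothetical names invented for this example, and events are kept in memory rather than written to a trace file.

```python
import time
from collections import Counter


class ToyTracer:
    """Toy in-memory "tape recorder" (hypothetical, not the LTTng API)."""

    def __init__(self):
        self.recording = False
        self.trace = []  # the "tape": (timestamp_ns, event name, fields)

    def start(self):
        self.recording = True

    def stop(self):
        self.recording = False

    def tracepoint(self, name, **fields):
        """Instrumentation point: records an event only while recording."""
        if self.recording:
            self.trace.append((time.monotonic_ns(), name, fields))


tracer = ToyTracer()


def handle_request(size):
    """Stand-in for application code instrumented with two tracepoints."""
    tracer.tracepoint("request_begin", size=size)
    checksum = sum(range(size))  # placeholder for real work
    tracer.tracepoint("request_end", size=size)
    return checksum


handle_request(10)  # not recorded: the tracer is not recording yet

tracer.start()
for size in (100, 1000, 100):
    handle_request(size)
tracer.stop()

# Read the trace manually, like a log file, one event per line...
for timestamp, name, fields in tracer.trace:
    print(timestamp, name, fields)

# ...or reduce it to a statistic, as a trace analyzer would.
event_counts = Counter(name for _, name, _ in tracer.trace)
print(event_counts)
```

Unlike a profile, the recorded trace keeps the order of events and the values attached to each one (here, `size`), which is what makes it possible to investigate the conditions under which a problem occurs.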
So, in the end, this is what LTTng is: a powerful, open source set of
tools to trace the Linux kernel and user applications. LTTng is
composed of several components actively maintained and developed by
its <a href="/community/#where" class="ext">community</a>.

Excluding proprietary solutions, a few competing software tracers
exist for Linux.
<a href="https://www.kernel.org/doc/Documentation/trace/ftrace.txt" class="ext">ftrace</a>
is the de facto function tracer of the Linux kernel.
<a href="http://linux.die.net/man/1/strace" class="ext">strace</a>
is able to record all system calls made by a user process.
<a href="https://sourceware.org/systemtap/" class="ext">SystemTap</a>
is a Linux kernel and user space tracer which uses custom user scripts
to produce plain text traces.
<a href="http://www.sysdig.org/" class="ext">sysdig</a>
also uses scripts, written in Lua, to trace and analyze the Linux
kernel.

The main distinctive feature of LTTng is that it produces correlated
kernel and user space traces, and does so with the lowest overhead
among these solutions. It produces trace files in the
<a href="http://www.efficios.com/ctf" class="ext"><abbr title="Common Trace Format">CTF</abbr></a>
format, an optimized file format for the production and analysis of
multi-gigabyte data. LTTng is the result of close to 10 years of
active development by a community of passionate developers. It is
currently available on all major desktop, server, and embedded Linux
distributions.

The main interface for tracing control is a single command line tool
named `lttng`. It can create several tracing sessions, enable and
disable events on the fly, filter them efficiently with custom
user expressions, start and stop tracing, and do much more. Traces can
be recorded on disk or sent over the network, kept totally or
partially, and viewed once tracing is inactive or in real time.

[Install LTTng now](#doc-installing-lttng) and start tracing!