---
id: what-is-tracing
---

As software engineering progressed, it led to what we now take for
granted: numerous complex, interdependent software applications running
in parallel on sophisticated operating systems like Linux. The authors
of such components, the software developers, naturally began to want
tools to ensure the robustness and good performance of their
masterpieces.

One major achievement in this field is, inarguably, the
<a href="https://www.gnu.org/software/gdb/" class="ext">GNU debugger
(GDB)</a>, which is an essential tool for developers to find and fix
bugs. But even the best debugger won't help make your software run
faster, and nowadays, faster software means either more work done by
the same hardware, or cheaper hardware for the same work.

A _profiler_ is often the tool of choice to identify performance
bottlenecks. Profiling is suitable to identify _where_ performance is
lost in a given piece of software; the profiler outputs a profile, a
statistical summary of observed events, which you may use to discover
which functions took the most time to execute. However, a profiler
won't report _why_ some identified functions are the bottleneck.
Bottlenecks might only occur when specific conditions are met. Such
conditions are sometimes almost impossible for a statistical profiler
to capture, or impossible to reproduce once the overhead of an
event-based profiler has altered how the application runs. For a
thorough investigation of software performance issues, a history of
execution, with the recorded values of chosen variables and context,
is essential. This is where tracing comes in handy.
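
To make the difference between a summary and a history of execution concrete, here is a minimal Python sketch. The records, the function names (`handle_request`, `parse_header`) and the `cache_hit` field are all hypothetical, invented for illustration; they are not the output of any real profiler or tracer. The aggregated view shows _where_ time went, while the per-event view shows under which condition it was lost.

```python
from collections import defaultdict

# Hypothetical history of execution: one record per call, holding the
# duration and one piece of recorded context. All values are made up.
events = [
    {"func": "handle_request", "duration_ms": 2, "cache_hit": True},
    {"func": "handle_request", "duration_ms": 3, "cache_hit": True},
    {"func": "handle_request", "duration_ms": 95, "cache_hit": False},
    {"func": "parse_header", "duration_ms": 1, "cache_hit": True},
    {"func": "handle_request", "duration_ms": 2, "cache_hit": True},
]


def profile(records):
    """Profile-like summary: total time per function (the "where")."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["func"]] += rec["duration_ms"]
    return dict(totals)


def slow_calls(records, func, threshold_ms):
    """Trace-like query: full context of each slow call (the "why")."""
    return [
        rec for rec in records
        if rec["func"] == func and rec["duration_ms"] >= threshold_ms
    ]


summary = profile(events)
outliers = slow_calls(events, "handle_request", threshold_ms=50)
print(summary)   # total time spent in each function
print(outliers)  # the individual slow calls, with their context
```

In this made-up data set, the summary only says that `handle_request` dominates, whereas the individual records show that the single slow call is the one with `cache_hit` set to `False`.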

_Tracing_ is a technique used to understand what goes on in a running
software system. The software used for tracing is called a _tracer_,
which is conceptually similar to a tape recorder. While the tracer is
recording, specific probes placed in the software source code generate
events that are saved on a giant tape: a _trace_ file. Both user
applications and the operating system may be traced at the same time,
which makes it possible to resolve a wide range of problems that would
otherwise be extremely challenging.
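
The tape recorder analogy can be sketched in a few lines of Python. This is a toy model only: the `ToyTracer` class, its `probe()` method and the event layout are assumptions made for this example, and say nothing about how LTTng implements probes or stores traces.

```python
import json
import time


class ToyTracer:
    """Toy tracer: probes emit events only while recording is enabled."""

    def __init__(self):
        self.recording = False
        self.tape = []  # in-memory stand-in for the trace file

    def start(self):
        self.recording = True

    def stop(self):
        self.recording = False

    def probe(self, name, **fields):
        """Probe site: append one timestamped event if recording."""
        if not self.recording:
            return  # a probe hit while not recording leaves no event
        self.tape.append(
            {"ts_ns": time.monotonic_ns(), "name": name, "fields": fields}
        )

    def dump(self):
        """Serialize the tape as one JSON object per line."""
        return "\n".join(json.dumps(event) for event in self.tape)


tracer = ToyTracer()


def compute(n):
    # Probes are placed directly in the application source code.
    tracer.probe("compute_entry", n=n)
    result = sum(range(n))
    tracer.probe("compute_exit", result=result)
    return result


compute(3)      # not recorded: the tracer is stopped
tracer.start()
compute(4)      # recorded: one entry event and one exit event
tracer.stop()
print(tracer.dump())
```

Each event carries a timestamp, a name, and the values chosen at the probe site, which is the kind of history of execution that a statistical summary cannot provide.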

Tracing is often compared to _logging_. However, tracers and loggers
are two different tools, serving two different purposes. Tracers are
designed to record much lower-level events that occur much more
frequently than log messages, often thousands of times per second,
with very little execution overhead. Logging is more appropriate for
very high-level analysis of less frequent events: user accesses,
exceptional conditions (errors and warnings, for example), database
transactions, instant messaging communications, and so on. More
formally, logging is one of several use cases that tracing can
accomplish.

The list of recorded events inside a trace file may be read manually
like a log file for the maximum level of detail, but it is generally
much more interesting to perform application-specific analyses to
produce reduced statistics and graphs that are useful for resolving a
given problem. Trace viewers and analysers are specialized tools
designed to do this.
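
As a rough illustration of such an analysis, the following Python sketch reduces a list of events to two statistics: a count per event name and the elapsed time between paired begin and end events. The event names (`request_begin`, `request_end`, `cache_miss`) and timestamps are hypothetical values invented for this example, not the format of a real trace.

```python
from collections import Counter

# Hypothetical trace: (timestamp in microseconds, event name) pairs.
# All values are made up for illustration.
trace = [
    (100, "request_begin"),
    (130, "request_end"),
    (200, "cache_miss"),
    (250, "request_begin"),
    (400, "request_end"),
]


def count_events(events):
    """Number of occurrences of each event name."""
    return Counter(name for _, name in events)


def durations(events, begin, end):
    """Elapsed time between each begin event and the next end event."""
    result = []
    start = None
    for timestamp, name in events:
        if name == begin:
            start = timestamp
        elif name == end and start is not None:
            result.append(timestamp - start)
            start = None
    return result


counts = count_events(trace)
spans = durations(trace, "request_begin", "request_end")
print(counts)
print(spans, max(spans))
```

Reading the five events one by one gives the full detail, while the two reduced results point directly at the longest request.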

So, in the end, this is what LTTng is: a powerful, open source set of
tools to trace the Linux kernel and user applications at the same time.
LTTng is composed of several components actively maintained and
developed by its <a href="/community/#where" class="ext">community</a>.