| 1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> |
| 2 | <html> |
| 3 | <head> |
| 4 | <title>Linux Trace Toolkit trace analysis tools</title> |
| 5 | </head> |
| 6 | <body> |
| 7 | |
| 8 | <h1>Linux Trace Toolkit trace analysis tools</h1> |
| 9 | |
| 10 | <P>The Linux Trace Toolkit Visualizer, lttv, is a modular and extensible |
| 11 | tool to read, analyze, annotate and display traces. It accesses traces through |
| 12 | the libltt API and produces either textual output or graphical output using |
| 13 | the GTK library. This document describes the architecture of lttv for |
| 14 | developers. |
| 15 | |
| 16 | <P>Lttv is a small executable which links to the trace reading API, libltt, |
| 17 | and to the glib and gobject base libraries. |
| 18 | By itself it contains just enough code to |
| 19 | convert a trace to a textual format and to load modules. |
| 20 | The public |
| 21 | functions defined in the main program are available to all modules. |
| 22 | A number of |
| 23 | <I>text</I> modules may be dynamically loaded to extend the capabilities of |
| 24 | lttv, for instance to compute and print various statistics. |
| 25 | |
| 26 | <P>A more elaborate module, traceView, dynamically links to the GTK library |
| 27 | and to a support library, libgtklttv. When loaded, it displays graphical |
| 28 | windows in which one or more viewers in subwindows may be used to browse |
| 29 | details of events in traces. A number of other graphical modules may be |
| 30 | dynamically loaded to offer a choice of different viewers (e.g., process, |
| 31 | CPU or block devices state versus time). |
| 32 | |
| 33 | <H2>Main program: main.c</H2> |
| 34 | |
| 35 | <P>The main program parses the command line options, loads the requested |
| 36 | modules and executes the hooks registered in the global attributes |
| 37 | (/hooks/main/before, /hooks/main/core, /hooks/main/after). |
| 38 | |
| 39 | <H3>Hooks for callbacks: hook.h (hook.c)</H3> |
| 40 | |
| 41 | <P>In a modular extensible application, each module registers callbacks to |
| 42 | ensure that it gets called at appropriate times (e.g., after command line |
| 43 | options processing, at each event to compute statistics...). Hooks and lists |
| 44 | of hooks are defined for this purpose and are normally stored in the global |
| 45 | attributes under /hooks/*. |
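<P>The hook mechanism described above can be sketched in plain C as follows.
This is a minimal illustration, not the declarations found in hook.h: the
names HookFunc, Hook, HookList, hook_list_add and hook_list_call, and the
fixed-size list, are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical types for illustration; not the declarations in hook.h. */
typedef int (*HookFunc)(void *hook_data, void *call_data);

typedef struct {
    HookFunc func;      /* callback registered by a module */
    void *hook_data;    /* private data supplied at registration time */
} Hook;

#define MAX_HOOKS 16

typedef struct {
    Hook hooks[MAX_HOOKS];
    size_t count;
} HookList;

void hook_list_init(HookList *list)
{
    list->count = 0;
}

/* Append a callback; returns 0 on success, -1 if the list is full. */
int hook_list_add(HookList *list, HookFunc func, void *hook_data)
{
    if (list->count >= MAX_HOOKS)
        return -1;
    list->hooks[list->count].func = func;
    list->hooks[list->count].hook_data = hook_data;
    list->count++;
    return 0;
}

/* Call every hook in registration order; returns the number of hooks called. */
size_t hook_list_call(const HookList *list, void *call_data)
{
    size_t i;
    for (i = 0; i < list->count; i++)
        list->hooks[i].func(list->hooks[i].hook_data, call_data);
    return list->count;
}

/* Example callback: increments the int counter passed as hook data. */
int count_call(void *hook_data, void *call_data)
{
    (void)call_data;
    ++*(int *)hook_data;
    return 0;
}
```

<P>A module would typically call hook_list_add from its init function, and
the main program would later call hook_list_call on the list at the
appropriate time.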
| 46 | |
| 47 | <H3>Browsable data structures: iattribute.h (iattribute.c)</H3> |
| 48 | |
| 49 | <P>In several places, functions should operate on data structures for which the |
| 50 | list of members is extensible. For example, the statistics printing |
| 51 | module should not be |
| 52 | modified each time new statistics are added by other modules. |
| 53 | For this purpose, a gobject interface is defined in iattribute.h to |
| 54 | enumerate and access members in a data structure. Even if new modules |
| 55 | define custom data structures for efficiently storing statistics while they |
| 56 | are being computed, they will be generically accessible for the printing |
| 57 | routine as long as they implement the iattribute interface. |
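<P>The idea of a generic enumeration interface can be approximated in plain C
with a table of function pointers. This sketch does not use gobject and does
not reproduce iattribute.h: IAttributeVTable, IoStats and print_members are
hypothetical names, and values are restricted to long integers to keep the
example short.

```c
#include <stdio.h>

/* Hypothetical interface: how many members, and the name/value of member i. */
typedef struct {
    int (*count)(const void *self);
    void (*get)(const void *self, int i, const char **name, long *value);
} IAttributeVTable;

/* A custom, compact statistics structure defined by some module. */
typedef struct {
    long events;
    long bytes_written;
} IoStats;

int io_stats_count(const void *self)
{
    (void)self;
    return 2;
}

void io_stats_get(const void *self, int i, const char **name, long *value)
{
    const IoStats *s = self;
    if (i == 0) {
        *name = "Events";
        *value = s->events;
    } else {
        *name = "BytesWritten";
        *value = s->bytes_written;
    }
}

const IAttributeVTable io_stats_iattribute = { io_stats_count, io_stats_get };

/* Generic printing routine: it knows nothing about IoStats.
   Writes "name=value" lines into out and returns the number of characters. */
int print_members(const IAttributeVTable *vt, const void *obj,
                  char *out, size_t size)
{
    int i, used = 0;
    for (i = 0; i < vt->count(obj) && (size_t)used < size; i++) {
        const char *name;
        long value;
        vt->get(obj, i, &name, &value);
        used += snprintf(out + used, size - (size_t)used,
                         "%s=%ld\n", name, value);
    }
    return used;
}
```

<P>A module adding a new statistics structure only needs to supply its own
table of functions; print_members stays unchanged.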
| 58 | |
| 59 | <H3>Extensible data structures: attribute.h (attribute.c)</H3> |
| 60 | |
| 61 | <P>To allow each module to add its needed members to important data structures, |
| 62 | for instance new statistics for processes, the LttvAttributes type is |
| 63 | a container for named typed values. Each attribute has a textual key (name) |
| 64 | and an associated typed value. |
| 65 | It is similar to a C data structure except that the |
| 66 | number and type of the members can change dynamically. It may be accessed |
| 67 | either directly or through the iattribute interface. |
| 68 | |
| 69 | <P>Some members may be LttvAttributes objects, thus forming a tree of |
| 70 | attributes, not unlike hierarchical file systems or registries. This is used |
| 71 | for the global attributes, used to exchange information between modules. |
| 72 | Attributes are also attached to trace sets, traces and contexts to allow |
| 73 | storing arbitrary attributes. |
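<P>A tree of named, typed values of this kind might look as follows. This is
only a sketch under simplifying assumptions (fixed-size tables, names limited
to 31 characters, three value types); the type and function names are
hypothetical and do not come from attribute.h.

```c
#include <stdlib.h>
#include <string.h>

typedef enum { ATTR_NONE, ATTR_INT, ATTR_STRING, ATTR_TREE } AttrType;

typedef struct Attributes Attributes;

typedef struct {
    char name[32];          /* textual key */
    AttrType type;          /* type of the associated value */
    union {
        long i;
        const char *s;
        Attributes *tree;   /* sub-tree, as in a hierarchical file system */
    } v;
} Attribute;

struct Attributes {
    Attribute items[16];
    int count;
};

Attributes *attributes_new(void)
{
    return calloc(1, sizeof(Attributes));
}

/* Find a member by name, or create it with type ATTR_NONE if absent.
   Returns NULL when the container is full. */
Attribute *attributes_find_or_add(Attributes *a, const char *name)
{
    Attribute *m;
    int i;
    for (i = 0; i < a->count; i++)
        if (strcmp(a->items[i].name, name) == 0)
            return &a->items[i];
    if (a->count == 16)
        return NULL;
    m = &a->items[a->count++];
    strncpy(m->name, name, sizeof m->name - 1);
    m->name[sizeof m->name - 1] = '\0';
    m->type = ATTR_NONE;
    return m;
}

/* Walk a path such as "/hooks/main", creating sub-trees as needed.
   Returns NULL if a component exists but is not a sub-tree. */
Attributes *attributes_subtree(Attributes *root, const char *path)
{
    char buf[128], *tok;
    Attributes *cur = root;
    strncpy(buf, path, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    for (tok = strtok(buf, "/"); tok != NULL && cur != NULL;
         tok = strtok(NULL, "/")) {
        Attribute *m = attributes_find_or_add(cur, tok);
        if (m == NULL)
            return NULL;
        if (m->type == ATTR_NONE) {
            m->type = ATTR_TREE;
            m->v.tree = attributes_new();
        }
        if (m->type != ATTR_TREE)
            return NULL;
        cur = m->v.tree;
    }
    return cur;
}
```

<P>With such a structure, a module can add a member like BytesWritten under
a path of its choosing without any change to the container type.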
| 74 | |
| 75 | <H3>Modules: module.h (module.c)</H3> |
| 76 | |
| 77 | <P>The benefit of modules is to avoid recompiling the whole application when |
| 78 | adding new functionality. It also helps ensure that only the needed code |
| 79 | is loaded in memory. |
| 80 | |
| 81 | <P>Modules are loaded explicitly with g_module_open, either because they are |
| 82 | on the list of default modules or because a command line option requests them. |
| 83 | The functions in the module are not directly accessible. |
| 84 | Indeed, direct, compiled-in references to their functions would be dangerous |
| 85 | since they would exist even before (if ever) the module is loaded. |
| 86 | Each module contains a function named <i>init</i>. The main program obtains |
| 87 | its address using g_module_symbol and calls it. |
| 88 | The <i>init</i> function of the module |
| 89 | then calls everything it needs from the main program or from libraries, |
| 90 | typically registering callbacks in hooks lists stored in the global attributes. |
| 91 | No module function other than <i>init</i> is |
| 92 | directly called. Modules cannot see the functions from other modules since |
| 93 | they may or may not be loaded at the same time. |
| 94 | |
| 95 | <P>The modules must see the declarations for the functions |
| 96 | used, from the main program and from libraries, by including the associated |
| 97 | .h files. The list of libraries used must be provided as argument when |
| 98 | a module is linked. This will ensure that these libraries get loaded |
| 99 | automatically when that module is loaded. |
| 100 | |
| 101 | <P>Libraries contain a number of functions available to modules and to the main |
| 102 | program. They are loaded automatically at start time if linked by the main |
| 103 | program or at module load time if linked by that module. Libraries are |
| 104 | useful to contain functions needed by several modules. Indeed, functions |
| 105 | used by a single module could be simply part of that module. |
| 106 | |
| 107 | <P>A list of loaded modules is maintained. When a module is requested, this |
| 108 | list is checked to see whether it is already loaded. A module may request other modules |
| 109 | at the beginning of its init function. This will ensure that these modules |
| 110 | get loaded and initialized before the init function of the current module |
| 111 | proceeds. Circular dependencies are obviously to be avoided as the |
| 112 | initialization order among mutually dependent modules will be arbitrary. |
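<P>The bookkeeping for loaded modules and their dependencies can be
illustrated without any dynamic loading. In the sketch below a table of
registered init functions stands in for the shared objects that would be
opened with g_module_open; module_register, module_require and the two
example modules are hypothetical names used only for this illustration.

```c
#include <string.h>

typedef void (*InitFunc)(void);

typedef struct {
    const char *name;
    InitFunc init;
    int loaded;
} Module;

#define MAX_MODULES 8

Module module_table[MAX_MODULES];   /* stand-in for modules found on disk */
int module_count;

const char *init_log[MAX_MODULES];  /* order in which init functions ran */
int init_log_len;

int module_register(const char *name, InitFunc init)
{
    if (module_count == MAX_MODULES)
        return -1;
    module_table[module_count].name = name;
    module_table[module_count].init = init;
    module_table[module_count].loaded = 0;
    module_count++;
    return 0;
}

/* Load and initialize a module unless it is already loaded.
   Returns 0 on success, -1 if the module is unknown. */
int module_require(const char *name)
{
    int i;
    for (i = 0; i < module_count; i++) {
        Module *m = &module_table[i];
        if (strcmp(m->name, name) != 0)
            continue;
        if (!m->loaded) {
            m->loaded = 1;  /* marked first, so a circular request terminates */
            m->init();
        }
        return 0;
    }
    return -1;
}

/* Example modules: "stats" requests "state" at the start of its init. */
void state_init(void)
{
    init_log[init_log_len++] = "state";
}

void stats_init(void)
{
    module_require("state");
    init_log[init_log_len++] = "stats";
}
```

<P>Requesting "stats" initializes "state" first, and a later request for
"state" finds it already loaded and does nothing.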
| 113 | |
| 114 | <H3>Command line options: option.h (option.c)</H3> |
| 115 | |
| 116 | <P>Command line options are added as needed by the main program and by modules |
| 117 | as they are loaded. Thus, while options are scanned and acted upon (i.e., |
| 118 | options to load modules), the |
| 119 | list of options to recognize continues to grow. The options module registers |
| 120 | to get called by /hooks/main/before. It offers hooks /hooks/option/before |
| 121 | and /hooks/option/after which are called just before and just after |
| 122 | processing the options. Many modules register in their init function to |
| 123 | be called in /hooks/option/after to verify the options specified and |
| 124 | register further hooks accordingly. |
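<P>The following sketch shows how the list of recognized options can keep
growing while the command line is being scanned. It is not the code in
option.c: the option names "--module" and "--verbose", the module name, and
the functions option_add and option_parse are assumptions made for the
example, and loading a module is simulated by a callback that registers one
more option.

```c
#include <string.h>

typedef void (*OptionFunc)(const char *arg);

typedef struct {
    const char *name;
    int has_arg;        /* non-zero if the option consumes the next argument */
    OptionFunc func;
} Option;

#define MAX_OPTIONS 16

Option option_table[MAX_OPTIONS];
int option_count;
int verbose_level;

int option_add(const char *name, int has_arg, OptionFunc func)
{
    if (option_count == MAX_OPTIONS)
        return -1;
    option_table[option_count].name = name;
    option_table[option_count].has_arg = has_arg;
    option_table[option_count].func = func;
    option_count++;
    return 0;
}

void on_verbose(const char *arg)
{
    (void)arg;
    verbose_level++;
}

/* Stand-in for loading a module: its init registers a new option. */
void on_module(const char *arg)
{
    if (arg != NULL && strcmp(arg, "textDump") == 0)
        option_add("--verbose", 0, on_verbose);
}

/* Scan argv left to right. The table is searched afresh for every argument,
   so options added by earlier arguments are recognized by later ones.
   Returns the number of unrecognized arguments. */
int option_parse(int argc, const char **argv)
{
    int i, j, unknown = 0;
    for (i = 1; i < argc; i++) {
        for (j = 0; j < option_count; j++)
            if (strcmp(argv[i], option_table[j].name) == 0)
                break;
        if (j == option_count) {
            unknown++;
            continue;
        }
        if (option_table[j].has_arg && i + 1 < argc)
            option_table[j].func(argv[++i]);
        else
            option_table[j].func(NULL);
    }
    return unknown;
}
```

<P>In this sketch "--verbose" is rejected until the module that defines it
has been requested earlier on the same command line.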
| 125 | |
| 126 | <H2>Trace Analysis</H2> |
| 127 | |
| 128 | <P>The main purpose of the lttv application is to process trace sets, |
| 129 | calling registered hooks for each event in the traces and maintaining |
| 130 | a context (system state, accumulated statistics). |
| 131 | |
| 132 | <H3>Trace Sets: traceSet.h (traceSet.c)</H3> |
| 133 | |
| 134 | <P>Trace sets are defined such that several traces can be analyzed together. |
| 135 | Traces may be added to and removed from a trace set as needed. |
| 136 | The main program stores a trace set in /trace_set/default. |
| 137 | The content of the trace_set is defined by command line options and it is |
| 138 | used by analysis modules (batch or interactive). |
| 139 | |
| 140 | <H3>Trace Set Analysis: processTrace.h (processTrace.c)</H3> |
| 141 | |
| 142 | <p>The function <i>lttv_process_trace_set</i> loops over all the events |
| 143 | in the specified trace set for the specified time interval. <I>Before</I> |
| 144 | hooks are first |
| 145 | called for the trace set and for each trace and tracefile |
| 146 | (one per cpu plus control tracefiles) in the trace set. |
| 147 | Then hooks are called for |
| 148 | each event in sorted time order. Finally, <i>after</i> hooks are called |
| 149 | for the trace set and for each trace and tracefile in it. |
| 150 | |
| 151 | <P>To call all the event hooks in sorted time order, a priority queue |
| 152 | (or sorted tree) is used. The first event from each tracefile is read and its |
| 153 | time used as key in the sorted tree. Then, repeatedly, the event with the lowest |
| 154 | key is removed from the tree and its hooks are called, and the next event from |
| 155 | that tracefile is read and inserted in the tree, until all tracefiles are exhausted. |
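<P>The time-ordered merge can be sketched with a binary min-heap holding one
entry per tracefile, keyed by the time of that tracefile's next unread event.
The code below is an illustration only: tracefiles are reduced to arrays of
timestamps, and Tracefile, EventHook and process_in_time_order are
hypothetical names, not the interface of processTrace.h.

```c
#include <stddef.h>

typedef struct {
    const unsigned long *times; /* event times, sorted within the tracefile */
    size_t count;
    size_t next;                /* index of the next unread event */
} Tracefile;

typedef void (*EventHook)(size_t tracefile_index, unsigned long time,
                          void *data);

#define MAX_TRACEFILES 64

/* Heap key: time of the next unread event of tracefile idx. */
static unsigned long key(const Tracefile *tf, size_t idx)
{
    return tf[idx].times[tf[idx].next];
}

static void sift_down(size_t *heap, size_t n, size_t i, const Tracefile *tf)
{
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i, tmp;
        if (l < n && key(tf, heap[l]) < key(tf, heap[m])) m = l;
        if (r < n && key(tf, heap[r]) < key(tf, heap[m])) m = r;
        if (m == i) return;
        tmp = heap[i]; heap[i] = heap[m]; heap[m] = tmp;
        i = m;
    }
}

/* Call hook for every event of every tracefile in global time order.
   Returns the number of events delivered. */
size_t process_in_time_order(Tracefile *tf, size_t ntf,
                             EventHook hook, void *data)
{
    size_t heap[MAX_TRACEFILES], n = 0, i, delivered = 0;
    if (ntf > MAX_TRACEFILES)
        return 0;
    /* Read the first event of each non-empty tracefile. */
    for (i = 0; i < ntf; i++) {
        tf[i].next = 0;
        if (tf[i].count > 0)
            heap[n++] = i;
    }
    for (i = n / 2; i-- > 0; )
        sift_down(heap, n, i, tf);
    while (n > 0) {
        size_t idx = heap[0];       /* tracefile with the earliest event */
        hook(idx, key(tf, idx), data);
        delivered++;
        if (++tf[idx].next == tf[idx].count)
            heap[0] = heap[--n];    /* tracefile exhausted: drop it */
        sift_down(heap, n, 0, tf);  /* restore heap order for the new key */
    }
    return delivered;
}

/* Example hook: records the delivery order. */
typedef struct { unsigned long times[32]; size_t count; } TimeLog;

void record_time(size_t tracefile_index, unsigned long time, void *data)
{
    TimeLog *log = data;
    (void)tracefile_index;
    if (log->count < 32)
        log->times[log->count++] = time;
}
```

<P>Each delivery costs a number of comparisons logarithmic in the number of
tracefiles, which matters when there is one tracefile per CPU.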
| 156 | |
| 157 | <p>Each hook is called with a LttvContext gobject as call data. The LttvContext |
| 158 | object for the trace set before/after hooks is provided in the call to |
| 159 | lttv_process_trace_set. Shallow copies of this context are made for each |
| 160 | trace in the trace set for the trace before/after hooks. Again, shallow |
| 161 | copies of each trace context are made for each tracefile in a trace. |
| 162 | The context for each tracefile is used both for the tracefile before/after |
| 163 | hooks and when calling the hooks for the contained events. |
| 164 | |
| 165 | <p>The lttv_process_trace_set function sets the fields in the |
| 166 | context appropriately before calling a hook. For example, when calling an event hook, |
| 167 | the context contains: |
| 168 | |
| 169 | <DL> |
| 170 | <DT>trace_set_context<DD> context for the trace set. |
| 171 | <DT>trace_context<DD> context for the trace. |
| 172 | <DT>ts<DD> trace set. |
| 173 | <DT>t<DD> trace. |
| 174 | <DT>tf<DD> tracefile. |
| 175 | <DT>e<DD> event. |
| 176 | </DL> |
| 177 | |
| 178 | <P>The cost of providing all this information in the context is relatively |
| 179 | low. When calling a hook from one event to the next, in the same tracefile, |
| 180 | only the event field needs to be changed. |
| 181 | The contexts used when processing traces are key to extensibility and |
| 182 | performance. New modules may need additional data members in the context to |
| 183 | store intermediate results. For this purpose, it is possible to derive |
| 184 | subtypes of LttvContext in order to add new data members. |
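<P>Deriving a context subtype can be illustrated in plain C by embedding the
parent structure as the first member. The real LttvContext is a gobject; the
Context and CountingContext types below are simplified, hypothetical
stand-ins whose field names follow the list above.

```c
#include <stddef.h>

/* Simplified stand-in for the base context. */
typedef struct {
    void *ts;   /* trace set */
    void *t;    /* trace */
    void *tf;   /* tracefile */
    void *e;    /* event */
} Context;

/* Derived context: the parent comes first, so a pointer to a
   CountingContext is also a valid pointer to a Context. */
typedef struct {
    Context parent;
    unsigned long event_count;  /* data member added by a module */
} CountingContext;

/* Event hook: receives the context as call data and uses the added member. */
int count_events_hook(void *hook_data, void *call_data)
{
    Context *ctx = call_data;
    CountingContext *self = (CountingContext *)ctx;
    (void)hook_data;
    if (ctx->e != NULL)
        self->event_count++;
    return 0;
}

/* From one event to the next in the same tracefile,
   only the event field of the context changes. */
void deliver_events(Context *ctx, void **events, size_t n,
                    int (*hook)(void *hook_data, void *call_data))
{
    size_t i;
    for (i = 0; i < n; i++) {
        ctx->e = events[i];
        hook(NULL, ctx);
    }
}
```

<P>The processing loop only sees the base type, while the hook of the module
that created the context can reach its own additional members.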
| 185 | |
| 186 | |
| 187 | <H3>Reconstructing the system state from the trace: state.h (state.c)</H3> |
| 188 | |
| 189 | <P>The events in a trace often represent state transitions in the traced |
| 190 | system. When the trace is processed, and events accessed in time sorted |
| 191 | order, it is thus possible to reconstruct in part the state of the |
| 192 | traced system: state of each CPU, process, disk queue. The state of each |
| 193 | process may contain detailed information such as opened file descriptors |
| 194 | and memory map if needed by the analysis and if sufficient information is |
| 195 | available in the trace. This incrementally updated state information may be |
| 196 | used to display state graphs, or simply to compute state dependent |
| 197 | statistics (time spent in user or system mode, waiting for a file...). |
| 198 | |
| 199 | <P> |
| 200 | When tracing starts, at T0, no state is available. The OS state may be |
| 201 | obtained through "initial state" events which enumerate the important OS data |
| 202 | structures. Unless the state is obtained atomically, other events |
| 203 | describing state changes may be interleaved in the trace and must be |
| 204 | processed in the correct order. Once all the special initial state |
| 205 | events are obtained, at Ts, the complete state is available. From there the |
| 206 | system state can be deduced incrementally from the events in the trace. |
| 207 | |
| 208 | <P> |
| 209 | Analysis tools must be prepared for missing state information. In some cases |
| 210 | only a subset of events is traced, in others the trace may be truncated |
| 211 | in <i>flight recorder</i> mode. |
| 212 | |
| 213 | <P> |
| 214 | In interactive processing, the interval for which processing is required |
| 215 | varies. After scrolling a viewer, the events in the new interval to display |
| 216 | need to be processed in order to redraw the view. To avoid restarting |
| 217 | the processing at the trace start to reconstruct the system state |
| 218 | incrementally, the computed state may be saved at regular intervals, for example |
| 219 | every 100 000 events, in a time indexed database associated with a trace. |
| 220 | To conserve space, it may be possible in some cases to only store state |
| 221 | differences. |
| 222 | |
| 223 | <p>To process a specific time interval, the state at the beginning of the |
| 224 | interval would be obtained by copying the last preceding saved state |
| 225 | and processing the events since then to update the state. |
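<P>As an illustration of this procedure, the sketch below finds the last
saved state at or before a requested time with a binary search and then
replays the later events. The state is reduced to the process running on a
single CPU, and the names State, SchedEvent, last_snapshot_before and
state_at are hypothetical. It also assumes that a saved state already
reflects every event up to and including its own time.

```c
#include <stddef.h>

typedef struct { unsigned long time; int running_pid; } State;
typedef struct { unsigned long time; int next_pid; } SchedEvent;

/* Index of the last snapshot with time <= t, or -1 if there is none.
   The snapshots must be sorted by time. */
long last_snapshot_before(const State *snaps, size_t n, unsigned long t)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (snaps[mid].time <= t)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (long)lo - 1;
}

/* Reconstruct the state at time t: copy the last preceding snapshot, then
   apply the events that follow it up to t. Returns 0, or -1 if no snapshot
   precedes t. A real implementation would seek in the trace to the snapshot
   time instead of skipping earlier events one by one. */
int state_at(const State *snaps, size_t nsnaps,
             const SchedEvent *ev, size_t nev,
             unsigned long t, State *out)
{
    long s = last_snapshot_before(snaps, nsnaps, t);
    size_t i;
    if (s < 0)
        return -1;
    *out = snaps[s];
    for (i = 0; i < nev && ev[i].time <= t; i++) {
        if (ev[i].time <= snaps[s].time)
            continue;               /* already reflected in the snapshot */
        out->running_pid = ev[i].next_pid;
        out->time = ev[i].time;
    }
    return 0;
}
```

<P>The interval between saved states trades memory for replay time: a
shorter interval means fewer events to reprocess after a scroll.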
| 226 | |
| 227 | <p>A new subtype of LttvContext, LttvStateContext, is defined to add storage |
| 228 | for the state information. It defines a trace set state as a set of trace |
| 229 | states. The trace state is composed of processes, CPUs and block devices. |
| 230 | Each CPU has a currently executing process and each process state keeps |
| 231 | track of the interrupt stack frames (faults, interrupts, |
| 232 | system calls), executable file name and other information such as opened |
| 233 | file descriptors. Each frame stores the process status, entry time |
| 234 | and last status change time. |
| 235 | |
| 236 | <p>File state.c provides state updating hooks to be called when the trace is |
| 237 | processed. When a scheduling change event is delivered to the hook, for |
| 238 | instance, the current process for the CPU is changed and the state of the |
| 239 | incoming and outgoing processes is changed. |
| 240 | The state updating hooks are stored in the global attributes under |
| 241 | /hooks/state/core/trace_set/before, after, |
| 242 | /hooks/state/core/trace/before, after... |
| 243 | to be used by processing functions requiring state updating (batch and |
| 244 | interactive analysis, computing the state at time T by updating a preceding |
| 245 | saved state...). |
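<P>A state updating hook for scheduling change events might look like the
following sketch. The structures are deliberately small and hypothetical
(TraceState, Process, SchedChange and StateContext are not the types defined
in state.h), and the interrupt stack frames are left out.

```c
#include <stddef.h>

#define MAX_PROCS 8
#define MAX_CPUS 4

typedef enum { STATUS_WAITING, STATUS_RUNNING } Status;

typedef struct {
    int pid;
    Status status;
    unsigned long last_change;      /* time of the last status change */
} Process;

typedef struct {
    Process procs[MAX_PROCS];
    size_t nprocs;
    Process *current[MAX_CPUS];     /* process executing on each CPU */
} TraceState;

typedef struct { unsigned long time; unsigned cpu; int pid_in; } SchedChange;

/* Simplified context: the state plus the event being processed. */
typedef struct { TraceState *state; const SchedChange *e; } StateContext;

void trace_state_init(TraceState *s)
{
    size_t i;
    s->nprocs = 0;
    for (i = 0; i < MAX_CPUS; i++)
        s->current[i] = NULL;
}

/* Return the process with this pid, creating it (waiting) if unknown.
   Returns NULL when the process table is full. */
Process *state_get_process(TraceState *s, int pid)
{
    size_t i;
    for (i = 0; i < s->nprocs; i++)
        if (s->procs[i].pid == pid)
            return &s->procs[i];
    if (s->nprocs == MAX_PROCS)
        return NULL;
    s->procs[s->nprocs].pid = pid;
    s->procs[s->nprocs].status = STATUS_WAITING;
    s->procs[s->nprocs].last_change = 0;
    return &s->procs[s->nprocs++];
}

/* State updating hook: change the current process of the CPU and the
   status of the outgoing and incoming processes. */
int sched_change_hook(void *hook_data, void *call_data)
{
    StateContext *ctx = call_data;
    const SchedChange *ev = ctx->e;
    Process *in, *out;
    (void)hook_data;
    if (ev->cpu >= MAX_CPUS)
        return -1;
    in = state_get_process(ctx->state, ev->pid_in);
    out = ctx->state->current[ev->cpu];
    if (out != NULL) {
        out->status = STATUS_WAITING;
        out->last_change = ev->time;
    }
    if (in != NULL) {
        in->status = STATUS_RUNNING;
        in->last_change = ev->time;
    }
    ctx->state->current[ev->cpu] = in;
    return 0;
}
```

<P>Creating unknown processes on the fly is one way to cope with traces in
which the initial state events are missing.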
| 246 | |
| 247 | <H3>Computing Statistics: stats.h (stats.c)</H3> |
| 248 | |
| 249 | <p>This file defines a subtype of LttvStateContext, LttvStatsContext, |
| 250 | to store statistics on various aspects of a trace set. The LttvTraceSetStats |
| 251 | structure contains a set of LttvTraceStats structures. Each such structure |
| 252 | contains structures for CPUs, processes, interrupt types (IRQ, system call, |
| 253 | fault), subtypes (individual system calls, IRQs or faults) and |
| 254 | block devices. The CPUs also contain structures for processes, interrupt types, |
| 255 | subtypes and block devices. Process structures similarly contain |
| 256 | structures for interrupt types, subtypes and block devices. At each level |
| 257 | (trace set, trace, cpu, process, interrupt stack frames) |
| 258 | attributes are used to store statistics. |
| 259 | |
| 260 | <p>File stats.c provides statistics computing hooks to be called when the |
| 261 | trace is processed. For example, when a <i>write</i> event is processed, |
| 262 | the attribute <i>BytesWritten</i> in the corresponding system, cpu, process, |
| 263 | interrupt type (e.g. system call) and subtype (e.g. write) is incremented |
| 264 | by the number of bytes stored in the event. When the processing is finished, |
| 265 | perhaps in the after hooks, the number of bytes written and other statistics |
| 266 | may be summed over all CPUs for a given process, over all processes for a |
| 267 | given CPU or over all traces. |
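<P>The accumulation and later summation of a statistic such as BytesWritten
can be sketched as follows. For brevity the counter is kept only at the
finest level (per CPU and per process) in a fixed-size array and the coarser
totals are computed afterwards; the text above describes attributes being
updated at several levels, so this is a simplification, and all names are
hypothetical.

```c
#include <string.h>

#define NUM_CPUS 2
#define NUM_PROCS 3

typedef struct {
    unsigned long bytes_written[NUM_CPUS][NUM_PROCS];
} WriteStats;

void write_stats_init(WriteStats *s)
{
    memset(s, 0, sizeof *s);
}

/* Called for each write event; returns -1 for an out-of-range index. */
int on_write_event(WriteStats *s, unsigned cpu, unsigned proc,
                   unsigned long nbytes)
{
    if (cpu >= NUM_CPUS || proc >= NUM_PROCS)
        return -1;
    s->bytes_written[cpu][proc] += nbytes;
    return 0;
}

/* Sum over all CPUs for a given process. */
unsigned long bytes_written_by_process(const WriteStats *s, unsigned proc)
{
    unsigned long sum = 0;
    unsigned cpu;
    if (proc >= NUM_PROCS)
        return 0;
    for (cpu = 0; cpu < NUM_CPUS; cpu++)
        sum += s->bytes_written[cpu][proc];
    return sum;
}

/* Sum over all processes for a given CPU. */
unsigned long bytes_written_on_cpu(const WriteStats *s, unsigned cpu)
{
    unsigned long sum = 0;
    unsigned proc;
    if (cpu >= NUM_CPUS)
        return 0;
    for (proc = 0; proc < NUM_PROCS; proc++)
        sum += s->bytes_written[cpu][proc];
    return sum;
}

/* Sum over everything, as an after hook might do. */
unsigned long bytes_written_total(const WriteStats *s)
{
    unsigned long sum = 0;
    unsigned cpu;
    for (cpu = 0; cpu < NUM_CPUS; cpu++)
        sum += bytes_written_on_cpu(s, cpu);
    return sum;
}
```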
| 268 | |
| 269 | <p>The basic set of statistics computed by stats.c includes for the whole |
| 270 | trace set: |
| 271 | |
| 272 | <UL> |
| 273 | <LI>Trace start time, end time and duration. |
| 274 | <LI>Total number of events. |
| 275 | <LI>Number of each event type (Interrupts, faults, system calls...) |
| 276 | <LI>For each interrupt type and each subtype, the number of each event type. |
| 277 | <LI>For each system: |
| 278 | <UL> |
| 279 | <LI>Total number of events. |
| 280 | <LI>Number of each event type (Interrupts, faults, system calls...) |
| 281 | <LI>For each interrupt type and each subtype, the number of each event type. |
| 282 | <LI>For each CPU: |
| 283 | <UL> |
| 284 | <LI> CPU id |
| 285 | <LI> User/System time |
| 286 | <LI> Number of each event type |
| 287 | <LI> For each interrupt type and each subtype, |
| 288 | the number of each event type. |
| 289 | </UL> |
| 290 | <LI>For each block device: |
| 291 | <UL> |
| 292 | <LI> block device name |
| 293 | <LI> time busy/idle, average queue length |
| 294 | <LI> Number of each relevant event type (requests added, merged, served) |
| 295 | </UL> |
| 296 | <LI>For each process: |
| 297 | <UL> |
| 298 | <LI> Exec'ed file names. |
| 299 | <LI> Start and end time, User/System time |
| 300 | <LI> Number of each event type |
| 301 | <LI> For each interrupt type and each subtype, |
| 302 | the number of each event type. |
| 303 | </UL> |
| 304 | </UL> |
| 305 | </UL> |
| 306 | |
| 307 | <P>The structure to store statistics differs from the state storage structure |
| 308 | in several ways. Statistics are maintained in different ways (per CPU all |
| 309 | processes, per process all CPUs, per process on a given CPU...). Furthermore, |
| 310 | statistics are maintained for all processes which existed during the trace |
| 311 | while the state at time T only stores information about current processes. |
| 312 | |
| 313 | <P>The hooks defined by stats.c are stored in the global attributes under |
| 314 | /hooks/stats/core/trace_set/before, after, |
| 315 | /hooks/stats/core/trace/before, after to be used by processing functions |
| 316 | interested in statistics. |
| 317 | |
| 318 | <H3>Filtering events: filter.h (filter.c)</H3> |
| 319 | |
| 320 | <P> |
| 321 | Filters are used to select which events in a trace are shown in a viewer or are |
| 322 | used in a computation. The filtering rules are based on the values of |
| 323 | events fields. The filter module receives a filter expression and computes |
| 324 | a compiled filter. The compiled filter then serves as hook data for |
| 325 | <i>check</i> event |
| 326 | filter hooks which, given a context containing an event, |
| 327 | return TRUE or FALSE to |
| 328 | indicate if the event satisfies the filter. Trace and tracefile <i>check</i> |
| 329 | filter hooks |
| 330 | may be used to determine if a system and CPU satisfy the filter. Finally, |
| 331 | the filter module has a function to return the time bounds, if any, imposed |
| 332 | by a filter. |
| 333 | |
| 334 | <P>For some applications, the hooks provided by the filter module may not |
| 335 | be sufficient, since they are based on simple boolean combinations |
| 336 | of comparisons between fields and constants. In that case, custom code may be |
| 337 | used for <i>check</i> hooks during the processing. An example of complex |
| 338 | filtering could be to only show events belonging to processes which consumed |
| 339 | more than 10% of the CPU in the last 10 seconds. |
| 340 | |
| 341 | <p>In module filter.c, filters are specified using textual expressions |
| 342 | with AND, OR, NOT operations on |
| 343 | nested subexpressions. Primitive expressions compare an event field to |
| 344 | a constant. In the graphical user interface, a filter editor is provided. |
| 345 | |
| 346 | <PRE><TT> |
| 347 | tokens: ( ! &amp;&amp; || == &lt;= &gt;= &gt; &lt; != name [ ] int float string ) |
| 348 | |
| 349 | expression = ( expression ) OR ! expression OR |
| 350 | expression &amp;&amp; expression OR expression || expression OR |
| 351 | simple_expression |
| 352 | |
| 353 | simple_expression = field_selector OP value |
| 354 | |
| 355 | value = int OR float OR string OR enum |
| 356 | |
| 357 | field_selector = component OR component . field_selector |
| 358 | |
| 359 | component = name OR name [ int ] |
| 360 | </TT></PRE> |
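<P>Once an expression following this grammar has been parsed, the compiled
filter can be represented as a tree and evaluated against each event. The
sketch below covers only the evaluation step, with integer field values and
flat field names; the Filter, Field and Event types and the filter_check
function are hypothetical and are not taken from filter.h.

```c
#include <stddef.h>
#include <string.h>

typedef enum { F_AND, F_OR, F_NOT, F_CMP } FilterKind;
typedef enum { OP_EQ, OP_NE, OP_LT, OP_LE, OP_GT, OP_GE } CmpOp;

/* One node of a compiled filter. */
typedef struct Filter {
    FilterKind kind;
    const struct Filter *left;  /* operand of NOT, left operand of AND/OR */
    const struct Filter *right; /* right operand of AND/OR */
    const char *field;          /* F_CMP only: name of the event field */
    CmpOp op;                   /* F_CMP only */
    long value;                 /* F_CMP only: constant to compare with */
} Filter;

typedef struct { const char *name; long value; } Field;
typedef struct { const Field *fields; size_t nfields; } Event;

static int compare(long a, CmpOp op, long b)
{
    switch (op) {
    case OP_EQ: return a == b;
    case OP_NE: return a != b;
    case OP_LT: return a < b;
    case OP_LE: return a <= b;
    case OP_GT: return a > b;
    case OP_GE: return a >= b;
    }
    return 0;
}

/* Check hook body: returns 1 if the event satisfies the filter, else 0. */
int filter_check(const Filter *f, const Event *e)
{
    size_t i;
    switch (f->kind) {
    case F_AND:
        return filter_check(f->left, e) && filter_check(f->right, e);
    case F_OR:
        return filter_check(f->left, e) || filter_check(f->right, e);
    case F_NOT:
        return !filter_check(f->left, e);
    case F_CMP:
        for (i = 0; i < e->nfields; i++)
            if (strcmp(e->fields[i].name, f->field) == 0)
                return compare(e->fields[i].value, f->op, f->value);
        return 0;   /* field absent: the comparison fails */
    }
    return 0;
}
```

<P>In this sketch a comparison on a field that the event does not have
evaluates to false, which is one possible convention among several.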
| 361 | |
| 362 | |
| 363 | <H3>Batch Analysis: batchAnalysis.h (batchAnalysis.c)</H3> |
| 364 | |
| 365 | <p>This module registers to be called by the main program (/hooks/main/core). |
| 366 | When called, it gets the current trace set (/trace_set/default), |
| 367 | state updating hooks (/hooks/state/*), the statistics hooks |
| 368 | (/hooks/stats/*) and other analysis hooks (/hooks/batch/*) |
| 369 | and runs lttv_process_trace_set for the entire |
| 370 | trace set time interval. This simple processing of the complete trace set |
| 371 | is normally sufficient for batch operations such as converting a trace to |
| 372 | text and computing various statistics. |
| 373 | |
| 374 | |
| 375 | <H3>Text output for events and statistics: textDump.h (textDump.c)</H3> |
| 376 | |
| 377 | <P> |
| 378 | This module registers hooks (/hooks/batch) |
| 379 | to print a textual representation of each event |
| 380 | (event hooks) and to print the content of the statistics accumulated in the |
| 381 | context (after trace set hook). |
| 382 | |
| 383 | <H2>Trace Set Viewers</H2> |
| 384 | |
| 385 | <p> |
| 386 | A library, libgtklttv, is defined to provide utility functions for |
| 387 | the second set of modules, which compose the interactive graphical user |
| 388 | interface. It offers functions to create and interact with top level trace |
| 389 | viewing windows, and to insert specialized embedded viewer modules. |
| 390 | The libgtklttv library requires the gtk library. |
| 391 | The viewer modules include a detailed event list, eventsTableView, |
| 392 | a process state graph, processStateView, and a CPU state graph, cpuStateView. |
| 393 | |
| 394 | <p> |
| 395 | The top level gtkTraceSet window, defined in libgtklttv, |
| 396 | has the usual FILE EDIT... menu and a toolbar. |
| 397 | It has an associated trace set (and filter) and contains several tabs, each |
| 398 | containing several vertically stacked time synchronized trace set viewers. |
| 399 | It manages the space allocated to each contained viewer, the menu items and |
| 400 | tools registered by each contained viewer and the current time and current |
| 401 | time interval. |
| 402 | |
| 403 | <P> |
| 404 | When viewers change the current time or time interval, the gtkTraceSet |
| 405 | window notifies all contained viewers. When one or more viewers need |
| 406 | redrawing, the gtkTraceSet window calls the lttv_process_trace_set |
| 407 | function for the needed time interval, after computing the system state |
| 408 | for the interval start time. While events are processed, drawing hooks |
| 409 | from the viewers are called. |
| 410 | |
| 411 | <P> |
| 412 | TO COMPLETE; description and motivation for the gtkTraceSet widget structure |
| 413 | and interaction with viewers. Description and motivation for the detailed |
| 414 | event view and process state view. |
| 415 | |
| 416 | </BODY> |
| 417 | </HTML> |