<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Linux Trace Toolkit Status</title>
</head>
<body>

<h1>Linux Trace Toolkit Status</h1>

<p><i>Last updated July 1, 2003.</i></p>

<p>During the 2002 Ottawa Linux Symposium tracing BOF, Richard Moore
collected a list of desirable features for LTT. Since then, a lot of
infrastructure work on LTT has taken place. This status report aims to track
current development efforts and the current status of the various features.
This status page is most certainly incomplete; please send any additions
and corrections to Michel Dagenais (michel.dagenais at polymtl.ca).</p>

<p>As of this writing, the most active LTT contributors include Karim Yaghmour,
author and maintainer, from opersys.com; Tom Zanussi, Robert Wisniewski,
Richard J. Moore and others from IBM, mainly at the Linux Technology Center;
XiangXiu Yang, Mathieu Desnoyers, Benoit des Ligneris and Michel Dagenais,
from the department of Computer Engineering at Ecole Polytechnique de
Montreal; and Frank Rowand, from MontaVista.</p>

<h2>Work recently performed</h2>

<p><b>Lockless per-CPU buffers:</b> Tom Zanussi of IBM has implemented per-CPU
lockless buffering with low-overhead, very fine-grained timestamping, and has
updated the kernel patch and the trace visualizer accordingly, except for
viewing multiple per-CPU traces simultaneously.</p>

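<p>As an illustration of the idea (a hypothetical sketch, not LTT's actual code): each CPU writes only to its own buffer, so no locks are needed, and space is reserved with a single atomic fetch-and-add. All names below are invented for this example.</p>

```c
#include <stdint.h>

#define BUF_SIZE 4096

/* One buffer per CPU: no locks are needed because each CPU writes only
 * to its own buffer (illustrative sketch, not LTT's actual layout). */
struct cpu_buffer {
    uint8_t  data[BUF_SIZE];
    uint32_t offset;            /* next free byte */
};

/* Reserve len bytes in this CPU's buffer with an atomic fetch-and-add;
 * returns a pointer to the reserved slot, or 0 when the buffer is full
 * (a real tracer would switch to a fresh sub-buffer instead). */
static uint8_t *reserve_slot(struct cpu_buffer *buf, uint32_t len)
{
    uint32_t old = __sync_fetch_and_add(&buf->offset, len);
    if (old + len > BUF_SIZE)
        return (uint8_t *)0;
    return &buf->data[old];
}
```

<p>The atomic reservation is what keeps the fast path lock-free even when events are logged from interrupt context on the same CPU.</p>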
<p><b>RelayFS:</b> Tom Zanussi has implemented RelayFS, a separate, simple
and efficient component for moving data between the kernel and user space
applications. This component is reusable by other projects (printk, evlog,
lustre...) and removes a sizeable chunk from the current LTT, making each
piece (relayfs and relayfs-based LTT) simpler, more modular and possibly
more palatable for inclusion in the standard Linux kernel. Besides LTT on
RelayFS, he has also implemented printk over RelayFS with an automatically
resizeable printk buffer.</p>

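<p>The core mechanism can be pictured as a ring of sub-buffers shared between a kernel-side writer and a user-space reader. The sketch below is illustrative only; its names and policy do not match the actual RelayFS API.</p>

```c
#include <stdint.h>

#define NSUBBUFS    4
#define SUBBUF_SIZE 1024

/* A relay-style channel: the writer fills one sub-buffer at a time;
 * when it is full, the writer switches to the next one and the full
 * sub-buffer becomes readable from user space. */
struct relay_channel {
    uint8_t  bufs[NSUBBUFS][SUBBUF_SIZE];
    unsigned produced;   /* sub-buffers handed to the reader */
    unsigned consumed;   /* sub-buffers the reader has drained */
    unsigned offset;     /* write offset within the current sub-buffer */
};

/* Reserve len bytes, switching sub-buffers on overflow.  Returns the
 * reserved slot, or 0 when the reader has fallen a full ring behind
 * (overwriting instead would give "flight recorder" behaviour). */
static uint8_t *relay_reserve(struct relay_channel *ch, unsigned len)
{
    if (ch->offset + len > SUBBUF_SIZE) {
        if (ch->produced - ch->consumed >= NSUBBUFS - 1)
            return (uint8_t *)0;               /* ring is full */
        ch->produced++;                        /* switch sub-buffer */
        ch->offset = 0;
    }
    uint8_t *slot = &ch->bufs[ch->produced % NSUBBUFS][ch->offset];
    ch->offset += len;
    return slot;
}
```

<p>Whether a full ring drops events or overwrites the oldest sub-buffer is exactly the policy choice that distinguishes normal tracing from flight-recorder mode.</p>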
<p><b>New trace format:</b> Karim Yaghmour and Michel Dagenais, with input
from several LTT contributors, have designed a new trace format to accommodate
per-buffer tracefiles and dynamically defined event types. The new format
includes both the binary trace format and the event type description format.
XiangXiu Yang has developed a simple parser for the event type description
format. This parser is used to generate the tracing macros in the kernel
(genevent) and to support reading tracefiles in the trace reading library
(libltt).</p>

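<p>To make the discussion concrete, a variable-size binary event record might look like the following. The field names and sizes are illustrative assumptions for this sketch, not the actual new LTT format.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk layout of one variable-size event record: a
 * fixed header followed by a type-specific payload whose structure is
 * given by the event type description format. */
struct event_header {
    uint32_t timestamp;  /* cycle-counter delta from the buffer start */
    uint16_t event_id;   /* index into the event type descriptions */
    uint16_t size;       /* total record size, header included */
};

/* Step from one record to the next in a raw buffer; returns 0 past
 * the end.  The self-describing size field is what makes walking a
 * stream of variable-size events possible. */
static const struct event_header *
next_event(const uint8_t *buf, size_t buf_len, const struct event_header *ev)
{
    const uint8_t *next = (const uint8_t *)ev + ev->size;
    if (next + sizeof(struct event_header) > buf + buf_len)
        return (const struct event_header *)0;
    return (const struct event_header *)next;
}
```
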
<h2>Ongoing work</h2>

<p><b>Libltt:</b> XiangXiu Yang is finishing up an event reading library
and API which parses event descriptions and uses them to read traces and
decode events.</p>

<p><b>lttv:</b> XiangXiu Yang, Mathieu Desnoyers and Michel Dagenais are
remodeling the trace visualizer to use the new trace format and libltt API,
and to allow compiled and scripted plugins, which can dynamically
add new custom trace analysis functions.</p>

<h2>Planned work</h2>

<p>LTT already interfaces with Dynamic Probes. This feature will need to
be updated for the new LTT version.</p>

<p>The Kernel Crash Dump utilities are another very interesting complementary
project. Interfacing them with RelayFS will help implement useful
flight-recorder-like tracing for post-mortem analysis.</p>

<p>User level tracing is available in the current LTT version but requires
one system call per event. With the new RelayFS-based infrastructure, it
would be interesting to use a shared memory buffer directly accessible from
user space. Having one RelayFS channel per user would allow an extremely
efficient, yet secure, user level tracing mechanism.</p>

<p>Sending important events (process creation, event types/facilities
definitions) to a separate channel would make interactive trace browsing
more efficient. Only this concise trace of important events would need to
be processed in its entirety; the other, larger, gigabyte-size traces could
then be read in random access without requiring a first preprocessing pass.
A separate channel would also be required for incomplete traces, such as
when tracing to a circular buffer in "flight recorder" mode: the important
events would all be kept, while only the last buffers of ordinary events
would be retained.</p>

<p>Once the visualizer is able to read and display several traces, it
will be interesting to produce side-by-side synchronized views
(events from two interacting machines A and B, one above the other)
or even merged views (combined events from several CPUs in a single
merged graph). Time differences between interacting systems will
need to be estimated and somewhat compensated for.</p>

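<p>One standard way to estimate that offset, borrowed from NTP-style synchronization (an assumption here, since no method is specified above): machine A records send time t1, machine B records receive time t2 and reply time t3, and A records receive time t4. With symmetric network delay, B's clock offset relative to A is ((t2 - t1) + (t3 - t4)) / 2.</p>

```c
/* NTP-style round-trip offset estimate (illustrative; not prescribed
 * by the LTT design).  Assumes the one-way network delay is the same
 * in both directions. */
static double clock_offset(double t1, double t2, double t3, double t4)
{
    return ((t2 - t1) + (t3 - t4)) / 2.0;
}
```
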
<p>LTT currently writes a <i>proc</i> file at trace start time. This
file only contains minimal information about processes and
interrupt names. More information would be desirable for several
applications (process maps, open descriptors, content of the buffer
cache). Furthermore, this information may be more conveniently
gathered from within the kernel and simply written to the trace as
events at start time.</p>

<h2>New features already implemented since LTT 0.9.5</h2>

<ol>
<li> Per-CPU buffering scheme. </li>
<li> Logging without locking. </li>
<li> Minimal latency - minimal or no serialisation. (<i>Lockless tracing
using read_cycle_counter instead of gettimeofday.</i>) </li>
<li> Fine granularity time stamping - min=o(CPU cycle time),
max=.05 Gb Ethernet interrupt rate. (<i>Cycle counter being used</i>.) </li>
<li> Random access to trace event stream. (<i>Random access reading
of events in the trace is already available in LibLTT. However, one first
pass is required through the trace to find all the process creation events;
the cost of this first pass may be reduced in the future if process creation
events are sent to a separate, much smaller trace</i>.) </li>
</ol>

<h2>Features being worked on</h2>

<ol>
<li> Simple wrapper macros for trace instrumentation. (<i>GenEvent</i>)
</li>
<li> Easily expandable with new trace types. (<i>GenEvent</i>) </li>
<li> Multiple buffering schemes - switchable globally or selectable
by trace client. (<i>Will be simpler to obtain with RelayFS</i>.) </li>
<li> Global buffer scheme. (<i>Will be simpler to obtain with RelayFS</i>.)
</li>
<li> Per-process buffer scheme. (<i>Will be simpler to obtain with RelayFS</i>.)
</li>
<li> Per-NGPT thread buffer scheme. (<i>Will be simpler to obtain with
RelayFS</i>.) </li>
<li> Per-component buffer scheme. (<i>Will be simpler to obtain with
RelayFS</i>.) </li>
<li> A set of extensible and modular performance analysis post-processing
programs. (<i>Lttv</i>) </li>
<li> Filtering and selection mechanisms within the formatting utility. (<i>Lttv</i>)
</li>
<li> Variable size event records. (<i>GenEvent, LibEvent, Lttv</i>)
</li>
<li> Data reduction facilities able to logically combine traces from
more than one system. (<i>LibEvent, Lttv</i>) </li>
<li> Data presentation utilities able to present data from multiple
trace instances in a logically combined form. (<i>LibEvent, Lttv</i>)
</li>
<li> Major/minor code means of identification/registration/assignment.
(<i>GenEvent</i>) </li>
<li> A flexible formatting mechanism that caters for structures
and arrays of structures with recursion. (<i>GenEvent</i>) </li>
</ol>

<h2>Features already planned for</h2>

<ol>
<li> Init-time tracing. (<i>To be part of RelayFS</i>.) </li>
<li> Updated interface for Dynamic Probes. (<i>As soon as things stabilize.</i>)
</li>
<li> Support for "flight recorder" always-on tracing with minimal resource
consumption. (<i>To be part of RelayFS and interfaced to the Kernel Crash
Dump facilities.</i>) </li>
<li> Fine-grained dynamic trace instrumentation for kernel space and
user subsystems. (<i>Dynamic Probes, more efficient user level tracing.</i>)</li>
<li> System information logged at trace start. (<i>New special events
to add</i>.)</li>
<li> Collection of process memory map information at trace start/restart,
and updates of that information at fork/exec/exit. This allows address-to-name
resolution for user space. </li>
<li> A facility to write system snapshots (total memory layout
for kernel, drivers, and all processes) to a file. This is required for
trace post-processing on a system other than the one producing the trace.
Perhaps some of this is already implemented in the Kernel Crash Dump.</li>
<li> Even more efficient tracing from user space.</li>
<li> Better integration with tools to define static trace hooks.</li>
<li> Better integration with tools to dynamically activate tracing statements.</li>
</ol>

<h2>Features not currently planned</h2>

<ol>
<li> POSIX Tracing API compliance. </li>
<li> Function entry/exit tracing facility. (<i>Probably
a totally orthogonal mechanism, using either Dynamic Probes hooks or static
code instrumentation with the suitable GCC options for basic-block
instrumentation.</i>)</li>
<li> Processor performance counter (which most modern CPUs have) sampling
and recording. (<i>These counters can be read and their values sent in traced
events. Some support to collect these automatically at specific state change
times and to visualize the results would be nice.</i>)</li>
<li> Suspend &amp; resume capability. (<i>Why not simply stop the
trace and start a new one later? Otherwise, important information like process
creations while suspended must be obtained in some other way.</i>)</li>
<li> Per-packet send/receive event. (<i>New event types will be easily
added as needed.</i>)</li>
</ol>

<br>
<br>

</body>
</html>