86dab0cf |
1 | |
2 | Tracepoint proposal |
3 | |
4 | - Tracepoint infrastructure |
5 | - In-kernel users |
6 | - Complete typing, verified by the compiler |
7 | - Dynamically linked and activated |
8 | |
9 | - Marker infrastructure |
10 | - Exported API to userland |
11 | - Basic types only |
12 | |
13 | - Dynamic vs static |
14 | - In-kernel probes are dynamically linked, dynamically activated, connected to |
15 | tracepoints. Type verification is done at compile-time. Those in-kernel |
16 | probes can be a probe extracting the information to put in a marker or a |
17 | specific in-kernel tracer such as ftrace. |
18 | - Information sinks (LTTng, SystemTap) are dynamically connected to the |
19 | markers inserted in the probes and are dynamically activated. |
20 | |
21 | - Near instrumentation site vs in a separate tracer module |
22 | |
23 | A probe module, only if provided with the kernel tree, could connect to internal |
24 | tracing sites. This argues for keeping the tracepoint probes near the |
25 | instrumentation site code. However, if a tracer is general purpose and exports |
26 | typing information to userspace through some mechanism, it should only export |
27 | the "basic type" information and could therefore be shipped outside of the |
28 | kernel tree. |
29 | |
30 | In-kernel probes should be integrated into the kernel tree. They would be close to |
31 | the instrumented kernel code and would translate between the in-kernel |
32 | instrumentation and the "basic type" exports. Other in-kernel probes could |
33 | provide a different output (statistics available through debugfs for instance). |
34 | ftrace falls into this category. |
35 | |
36 | Generic or specialized information "sinks" (LTTng, SystemTap) could be connected |
37 | to the markers put in tracepoint probes to extract the information to userspace. |
38 | They would extract both typing information and the per-tracepoint execution |
39 | information to userspace. |
40 | |
41 | Therefore, the code would look like: |
42 | |
43 | kernel/sched.c: |
44 | |
45 | #include "sched-trace.h" |
46 | |
47 | schedule() |
48 | { |
49 | ... |
50 | trace_sched_switch(prev, next); |
51 | ... |
52 | } |
53 | |
54 | |
55 | kernel/sched-trace.h: |
56 | |
57 | DEFINE_TRACE(sched_switch, struct task_struct *prev, struct task_struct *next); |
58 | |
59 | |
60 | kernel/sched-trace.c: |
61 | |
62 | #include "sched-trace.h" |
63 | |
64 | static void probe_sched_switch(struct task_struct *prev, struct task_struct |
65 | *next) |
66 | { |
67 | trace_mark(kernel_sched_switch, "prev_pid %d next_pid %d prev_state %ld", |
68 | prev->pid, next->pid, prev->state); |
69 | } |
70 | |
71 | int __init init(void) |
72 | { |
73 | return register_trace_sched_switch(probe_sched_switch); |
74 | } |
75 | |
76 | void __exit exit(void) |
77 | { |
78 | unregister_trace_sched_switch(probe_sched_switch); |
79 | } |
80 | |
81 | |
82 | Where the DEFINE_TRACE internals declare a structure, a trace_* inline function, |
83 | and register_trace_* and unregister_trace_* inline functions: |
84 | |
85 | A static instrumentation site structure, containing function pointers to |
86 | deactivated functions and an activation boolean. It also contains the |
87 | "sched_switch" string. This structure is placed in a special section to create |
88 | an array of these structures. |
89 | |
90 | static inline void trace_sched_switch(struct task_struct *prev, |
91 | struct task_struct *next) |
92 | { |
93 | if (sched_switch tracing is activated) |
94 | marshall_probes(&instrumentation_site_structure, prev, next); |
95 | } |
96 | |
97 | static inline int register_trace_sched_switch( |
98 | void (*probe)(struct task_struct *prev, struct task_struct *next)) |
99 | { |
100 | return do_register_probe("sched_switch", (void *)probe); |
101 | } |
102 | |
103 | static inline void unregister_trace_sched_switch( |
104 | void (*probe)(struct task_struct *prev, struct task_struct *next)) |
105 | { |
106 | do_unregister_probe("sched_switch", (void *)probe); |
107 | } |
108 | |
109 | |
110 | We need a new kernel probe API: |
111 | |
112 | do_register_probe / do_unregister_probe |
113 | - Connects the in-kernel probe to the site |
114 | - Activates the site tracing (probe reference counting) |
115 | |
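The probe reference counting described above can be illustrated with a small
user-space sketch. It is not the proposed kernel implementation: the
struct trace_site layout, the MAX_PROBES limit, the find_site() and
trace_site_call() helpers, the two sample probes, and the int return values and
typed probe pointer are all assumptions made for illustration (the proposal
passes the probe as a void * and looks sites up in a special section). Only the
do_register_probe / do_unregister_probe names come from the text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical probe prototype; the real one is per-tracepoint. */
typedef void (*probe_fn)(int prev_pid, int next_pid);

#define MAX_PROBES 4	/* arbitrary limit for this sketch */

/* Assumed layout of an instrumentation site structure. */
struct trace_site {
	const char *name;		/* e.g. "sched_switch" */
	probe_fn probes[MAX_PROBES];	/* connected probes, NULL when unused */
	int refcount;			/* number of connected probes */
	int active;			/* activation boolean tested at the site */
};

/* Stand-in for the array that the special section would create. */
static struct trace_site sites[] = {
	{ .name = "sched_switch" },
};

static struct trace_site *find_site(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sites) / sizeof(sites[0]); i++)
		if (strcmp(sites[i].name, name) == 0)
			return &sites[i];
	return NULL;
}

/* Connect a probe to the named site; the first probe activates the site. */
int do_register_probe(const char *name, probe_fn probe)
{
	struct trace_site *site = find_site(name);
	int i;

	if (!site)
		return -1;
	for (i = 0; i < MAX_PROBES; i++) {
		if (!site->probes[i]) {
			site->probes[i] = probe;
			site->refcount++;
			site->active = 1;
			return 0;
		}
	}
	return -1;	/* no free slot */
}

/* Disconnect a probe; the site is deactivated when the last one leaves. */
int do_unregister_probe(const char *name, probe_fn probe)
{
	struct trace_site *site = find_site(name);
	int i;

	if (!site)
		return -1;
	for (i = 0; i < MAX_PROBES; i++) {
		if (site->probes[i] == probe) {
			site->probes[i] = NULL;
			if (--site->refcount == 0)
				site->active = 0;
			return 0;
		}
	}
	return -1;	/* probe was not registered */
}

/* Stand-in for trace_sched_switch(): call probes only if the site is active. */
void trace_site_call(const char *name, int prev_pid, int next_pid)
{
	struct trace_site *site = find_site(name);
	int i;

	if (!site || !site->active)
		return;
	for (i = 0; i < MAX_PROBES; i++)
		if (site->probes[i])
			site->probes[i](prev_pid, next_pid);
}

/* Two sample probes used to exercise the sketch. */
int switch_count;
int last_next_pid;

void count_switch(int prev_pid, int next_pid)
{
	(void)prev_pid;
	(void)next_pid;
	switch_count++;
}

void record_next(int prev_pid, int next_pid)
{
	(void)prev_pid;
	last_next_pid = next_pid;
}
```

Registering count_switch and record_next on "sched_switch" activates the site
once; it stays active until both have been unregistered, which is the behaviour
the reference count is meant to provide. A real implementation would also need
locking or RCU around the probe array, which this sketch leaves out.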
116 | |