1 | |
2 | LTTng synthetic TSC MSB |
3 | |
4 | Mathieu Desnoyers, March 1, 2006 |
5 | |
6 | A problem found on some architectures is that the TSC is limited to 32 bits, |
7 | which induces a wrap-around every 8 seconds or so. |
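
As a rough check of the "8 seconds or so" figure, here is a minimal sketch of the
arithmetic. The helper name and the 500 MHz frequency used in the usage note are
assumptions for illustration; the text does not state a counter frequency.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Seconds before a 32-bit cycle counter wraps around, for a counter
 * running at freq_hz cycles per second. Hypothetical helper.
 */
static double tsc32_wrap_period_s(double freq_hz)
{
	assert(freq_hz > 0.0);
	return 4294967296.0 / freq_hz;	/* 2^32 cycles per wrap-around */
}
```

With an assumed 500 MHz counter, tsc32_wrap_period_s(500e6) is about 8.59
seconds; a faster counter wraps proportionally sooner.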
8 | |
9 | Wrap-arounds are detectable through a heartbeat timer, which generates an |
10 | event in each trace at a periodic interval. This makes it possible to read the |
11 | trace sequentially. |
12 | |
13 | The problem is fast time seeking in the trace: it uses the buffer boundary |
14 | timestamps (64 bits) to seek to the right block in O(log(n)). It cannot, |
15 | however, read the trace sequentially to count the wrap-arounds on the way. |
16 | |
17 | So the problem is the following: we want to generate a per-CPU 64-bit TSC from |
18 | the available 32 bits, with the 32 MSB generated synthetically. It should be |
19 | readable by the buffer switch event. |
20 | |
21 | The idea is the following: we keep a 32-bit previous_tsc value per CPU. It |
22 | helps detect the wrap-around. Each time a heartbeat fires or a buffer switch |
23 | happens, previous_tsc is read, then updated with the new value. If the new value |
24 | is lower (a wrap-around), the msb_tsc for the CPU is atomically incremented. |
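
The wrap-around detection described above can be sketched as follows. This is a
minimal single-threaded illustration, not the LTTng code: the struct layout and
the function name are assumptions, only previous_tsc and msb_tsc come from the
text, and the atomic increment is deliberately left out to keep the comparison
logic visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU state; field names follow the text. */
struct synth_tsc_state {
	uint32_t previous_tsc;	/* last 32-bit TSC sample seen on this CPU */
	uint32_t msb_tsc;	/* synthetic upper 32 bits */
};

/*
 * Feed a fresh 32-bit TSC sample. A sample lower than the previous one
 * is taken to mean that the counter wrapped exactly once in between.
 * Returns the 64-bit synthetic TSC.
 */
static uint64_t synth_tsc_update(struct synth_tsc_state *s, uint32_t tsc32)
{
	assert(s != NULL);
	if (tsc32 < s->previous_tsc)
		s->msb_tsc++;		/* wrap-around detected */
	s->previous_tsc = tsc32;
	return ((uint64_t)s->msb_tsc << 32) | tsc32;
}
```

The sketch only works if samples arrive more often than the counter wraps, which
is what the periodic heartbeat is there to guarantee.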
25 | |
26 | We are sure that there is only one heartbeat at a given time because they are |
27 | fired at a fixed interval: typically 10 times per 32-bit TSC wrap-around. Even |
28 | better, as they are launched by a worker thread, a heartbeat can only be queued |
29 | once in the worker queue. |
30 | |
31 | Now consider buffer switch vs heartbeat concurrency. Worst case: a heartbeat is |
32 | happening: one CPU is in process context (worker thread), the other ones are |
33 | in interrupt context (IPI). On one CPU in IPI, an NMI is triggered that |
34 | generates a buffer switch. |
35 | |
36 | What is sure is that the heartbeat needs to read and write previous_tsc. It |
37 | also needs to atomically increment msb_tsc. However, the buffer switch only |
38 | needs to read previous_tsc, compare it to the current TSC and read |
39 | msb_tsc. |
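
The read-only path of the buffer switch could look like the sketch below. The
names are hypothetical, and adding one to msb_tsc when the current TSC is lower
than previous_tsc is an assumption about how the comparison result is used: it
covers a wrap-around that happened after the last heartbeat and has not been
counted yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU state; field names follow the text. */
struct synth_tsc_state {
	uint32_t previous_tsc;	/* written by the heartbeat only */
	uint32_t msb_tsc;	/* written by the heartbeat only */
};

/*
 * Buffer switch side: reads the state, never writes it.
 * Assumes at most one wrap-around since the last heartbeat.
 */
static uint64_t synth_tsc_peek(const struct synth_tsc_state *s, uint32_t tsc32)
{
	uint32_t msb;

	assert(s != NULL);
	msb = s->msb_tsc;
	if (tsc32 < s->previous_tsc)
		msb++;		/* wrapped since the last heartbeat */
	return ((uint64_t)msb << 32) | tsc32;
}
```

Reading previous_tsc and msb_tsc as two separate words is exactly where a
heartbeat arriving in between can produce an inconsistent pair.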
40 | |
41 | Another race case is that the buffer switch can be interrupted by the heartbeat. |
42 | |
43 | So what we need is an atomic write. As the architecture does not support a |
44 | 64-bit cmpxchg, we need this little data structure to overcome the |
45 | problem: |
46 | |
47 | An array of two 64-bit elements. Each element is updated in two memory writes, |
48 | but the element switch (current element) is made atomically. As there is only |
49 | one writer, this has no locking problem. |
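
A minimal user-space sketch of such a two-element structure, using C11 atomics
in place of kernel primitives. All names are hypothetical, and packing msb_tsc
and previous_tsc together into one 64-bit element is an assumption. It relies on
the single-writer property stated above, and on a reader never being delayed
across two successive writes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct synth_tsc {
	uint64_t slot[2];	/* (msb_tsc << 32) | previous_tsc */
	atomic_uint current;	/* index of the valid slot */
};

/* Extend a 32-bit sample using the last published 64-bit value. */
static uint64_t synth_tsc_extend(uint64_t last, uint32_t tsc32)
{
	uint64_t msb = last >> 32;

	if (tsc32 < (uint32_t)last)
		msb++;		/* wrap-around since the last update */
	return (msb << 32) | tsc32;
}

/* Single writer (heartbeat): fill the inactive slot, then publish it. */
static void synth_tsc_write(struct synth_tsc *t, uint32_t tsc32)
{
	unsigned int cur;

	assert(t != NULL);
	cur = atomic_load_explicit(&t->current, memory_order_relaxed);
	/* This store may take two memory writes on a 32-bit machine. */
	t->slot[cur ^ 1] = synth_tsc_extend(t->slot[cur], tsc32);
	/* The index switch is the only step that has to be atomic. */
	atomic_store_explicit(&t->current, cur ^ 1, memory_order_release);
}

/* Reader (buffer switch): only ever looks at the published slot. */
static uint64_t synth_tsc_read(struct synth_tsc *t, uint32_t tsc32)
{
	unsigned int cur;

	assert(t != NULL);
	cur = atomic_load_explicit(&t->current, memory_order_acquire);
	return synth_tsc_extend(t->slot[cur], tsc32);
}
```

A reader interrupted by one heartbeat still sees a complete element, because the
writer only touches the slot the reader is not using.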
50 | |
51 | We make sure the synthetic TSC reader does not sleep by disabling preemption. We |
52 | do the same for the writer. |