Atomic UP test results.

using test-time-probe2.ko

Clock speed : cpu MHz : 3000.077

Tracing inactive

[ 125.787229] test init
[ 125.787303] test results : time per probe
[ 125.787306] number of loops : 20000
[ 125.787309] total time : 204413
[ 125.787312] test end
[ 175.660402] test init
[ 175.660475] test results : time per probe
[ 175.660479] number of loops : 20000
[ 175.660482] total time : 203468
[ 175.660484] test end
[ 179.337362] test init
[ 179.337436] test results : time per probe
[ 179.337440] number of loops : 20000
[ 179.337443] total time : 204757
[ 179.337446] test end

Res : 10.21 cycles per loop

Atomic UP, one trace, flight recorder.

[ 357.983971] test init
[ 357.988837] test results : time per probe
[ 357.988843] number of loops : 20000
[ 357.988846] total time : 12349013
[ 357.988849] test end
[ 358.718896] test init
[ 358.723049] test results : time per probe
[ 358.723053] number of loops : 20000
[ 358.723057] total time : 12332497
[ 358.723059] test end
[ 359.422038] test init
[ 359.426173] test results : time per probe
[ 359.426179] number of loops : 20000
[ 359.426182] total time : 12332535
[ 359.426185] test end

Res : 616.90 cycles per loop
      205.63 ns per loop

Atomic SMP, one trace, flight recorder.

[ 111.694180] test init
[ 111.700191] test results : time per probe
[ 111.700198] number of loops : 20000
[ 111.700201] total time : 16925670
[ 111.700204] test end
[ 112.285716] test init
[ 112.291321] test results : time per probe
[ 112.291326] number of loops : 20000
[ 112.291329] total time : 16766633
[ 112.291332] test end
[ 112.880602] test init
[ 112.884739] test results : time per probe
[ 112.884743] number of loops : 20000
[ 112.884746] total time : 12358237
[ 112.884748] test end

Res : 767.51 cycles per loop
      255.83 ns per loop

Atomic UP is faster than atomic SMP by :
(255.83-205.63)/255.83 * 100% = 19.62 %
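
The Res figures and the percentage above follow from simple arithmetic on the
logged totals. A minimal C sketch of that calculation, assuming the logged
"total time" is a cycle count and that each Res value is the mean of the three
runs. The function names are illustrative only and are not taken from
test-time-probe2.ko :

```c
#include <assert.h>
#include <stddef.h>

/* Mean cost of one loop, in cycles: average the per-run totals,
 * then divide by the number of loops in a run. */
double cycles_per_loop(const unsigned long *totals, size_t nruns,
                       unsigned long loops)
{
    double sum = 0.0;

    assert(nruns > 0 && loops > 0);
    for (size_t i = 0; i < nruns; i++)
        sum += (double)totals[i];
    return sum / (double)nruns / (double)loops;
}

/* Convert cycles to nanoseconds: cycles / MHz gives microseconds,
 * so multiply by 1000 to get nanoseconds. */
double cycles_to_ns(double cycles, double cpu_mhz)
{
    return cycles * 1000.0 / cpu_mhz;
}

/* Relative gain of 'improved' over 'baseline', in percent. */
double relative_gain_percent(double baseline, double improved)
{
    return (baseline - improved) / baseline * 100.0;
}
```

Feeding in the three UP totals and the three SMP totals with 20000 loops and
3000.077 MHz gives values that agree with the figures reported above to two
decimal places.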


Difference between the primitives (total time / number of loops) :
cmpxchg           2967855/20000 = 148.39 cycles or  49.46 ns
cmpxchg-up         540577/20000 =  27.03 cycles or   9.01 ns
irq save/restore 12636562/20000 = 631.83 cycles or 210.60 ns


* Memory ordering

offset
  written by local CPU
  read by local CPU and other CPUs (reader)

commit_count
  written by local CPU
  read by local CPU and other CPUs (reader)

consumed
  written by any CPU
  read by any CPU

data
  written by local CPU
  read by any CPU

Test done in the reader :
if ( consumed < offset )
    if ( subbuf.commit_count == multiple of SUBBUFSIZE )
        read data
        inc consumed
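
The reader test above can be sketched in user-space C11 as follows. This is
only an illustration under stated assumptions : the struct layout, the
function name and the SUBBUFSIZE value are hypothetical, C11 atomics stand in
for the kernel primitives, and the check transcribes the pseudocode literally
without handling cases the notes do not discuss (such as a commit count of
zero).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SUBBUFSIZE 4096UL   /* assumed sub-buffer size, for illustration */

struct subbuf_state {
    atomic_ulong commit_count;  /* written by the local CPU */
};

struct chan_state {
    atomic_ulong offset;        /* written by the local CPU */
    atomic_ulong consumed;      /* written by any CPU */
};

/* Returns true when the reader test from the notes passes. */
bool subbuf_readable(struct chan_state *c, struct subbuf_state *sb)
{
    unsigned long consumed, offset, commit;

    assert(c != NULL && sb != NULL);
    consumed = atomic_load_explicit(&c->consumed, memory_order_relaxed);
    offset = atomic_load_explicit(&c->offset, memory_order_relaxed);
    if (consumed >= offset)
        return false;           /* nothing new has been reserved */

    /* Acquire load: data reads issued after this cannot be reordered
     * before the commit_count read. */
    commit = atomic_load_explicit(&sb->commit_count, memory_order_acquire);
    return commit % SUBBUFSIZE == 0;
}
```

Only when this returns true would the reader go on to read the data and
increment consumed.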


We must guarantee the following ordering :

* offset
  Seen from the local CPU :
    offset must always be incremented before the data is written (already
    consistent : a CPU always sees its own writes in program order).

  Seen from other CPUs :
    offset and data can be written out of order. Because offset is only ever
    incremented, in the out-of-order case a reader sees an offset lower than
    the data actually written, which is safe : the commit_count _has_ to be
    incremented before the data can be read, and that increment is preceded by
    a store fence.

* commit_count
  commit_count must always be seen by other CPUs after the data has been
  written. Therefore, we must put a store fence (smp_wmb) before the
  commit_count write.
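
The commit_count rule can be illustrated with a user-space C11 sketch. It is
an analogy, not the tracer code : the struct, the function name and the
64-byte buffer are hypothetical, and atomic_thread_fence(memory_order_release)
is used here in the role the notes give to smp_wmb (it is somewhat stronger,
since it also orders earlier loads).

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

struct subbuf {
    char data[64];              /* illustrative payload area */
    atomic_ulong commit_count;  /* bytes committed so far */
};

/* Copy len bytes at position pos, then publish them through commit_count. */
void write_and_commit(struct subbuf *sb, unsigned long pos,
                      const void *src, unsigned long len)
{
    assert(pos + len <= sizeof(sb->data));

    /* 1. Write the data. */
    memcpy(sb->data + pos, src, len);

    /* 2. Store fence: the data stores above become visible to other
     *    CPUs before the commit_count update below. */
    atomic_thread_fence(memory_order_release);

    /* 3. Update commit_count only after the fence. */
    atomic_fetch_add_explicit(&sb->commit_count, len, memory_order_relaxed);
}
```

A reader that observes the new commit_count with an acquire load is then
guaranteed to see the data written in step 1.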

* consumed
  Rarely updated, so use the LOCK prefix, which acts as a full memory barrier.
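
A sketch of such an update in user-space C11, assuming consumed advances by
one sub-buffer at a time and that a compare-and-exchange is used. Both are
assumptions, since the notes only say "inc consumed", and the function name
and SUBBUFSIZE value are hypothetical. On x86 a sequentially consistent
compare-and-exchange is typically compiled to a LOCK-prefixed cmpxchg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SUBBUFSIZE 4096UL   /* assumed sub-buffer size, for illustration */

/* Try to move consumed from old_value to old_value + SUBBUFSIZE.
 * Returns false if another CPU changed consumed in the meantime. */
bool advance_consumed(atomic_ulong *consumed, unsigned long old_value)
{
    unsigned long expected = old_value;

    assert(consumed != NULL);
    /* Default (seq_cst) ordering: behaves as a full memory barrier. */
    return atomic_compare_exchange_strong(consumed, &expected,
                                          old_value + SUBBUFSIZE);
}
```

Because consumed can be written by any CPU, a failed exchange tells the caller
that another CPU got there first, and the caller can re-read consumed and
retry.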


Mathieu Desnoyers, November 2006