Fix: ring buffer: RING_BUFFER_FLUSH ioctl buffer corruption
lib_ring_buffer_switch_slow() clearly states:
* Note, however, that as a v_cmpxchg is used for some atomic
* operations, this function must be called from the CPU which owns the
* buffer for a ACTIVE flush.
But unfortunately, the RING_BUFFER_FLUSH ioctl does not follow these
important directives. Therefore, whenever the consumer daemon or session
daemon explicitly triggers a "flush" on a buffer, it can race with data
being written to the buffer, leading to corruption of the reserve/commit
counters, and therefore corruption of data in the buffer. It triggers
these warnings for overwrite mode buffers:
[65356.890016] WARNING: at
/home/compudj/git/lttng-modules/wrapper/ringbuffer/../../lib/ringbuffer/../../wrapper/ringbuffer/../../lib/ringbuffer/backend.h:110 lttng_event_write+0x118/0x140 [lttng_ring_buffer_client_mmap_overwrite]()
Which indicates that we are trying to write into a sub-buffer for which
we don't have exclusive access. It also causes the following warnings to
show up:
[65335.896208] ring buffer relay-overwrite-mmap, cpu 5: records were lost. Caused by:
[65335.896208] [ 0 buffer full, 80910 nest buffer wrap-around, 0 event too big ]
Which is caused by corrupted commit counter.
Fix this by sending an IPI to the CPU owning the flushed buffer for
per-cpu synchronization. For global synchronization, no IPI is needed,
since we allow writes from remote CPUs.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
This page took 0.026931 seconds and 4 git commands to generate.