Fix: extraneous empty/inactive flush on rotation out of a trace chunk
Observed issue
==============
A test (tests/regression/tools/tracefile-limits/test_tracefile_count)
occasionally fails on ppc64. The trace validation steps in the fails in
the case where the trace file count limit is set to 1.
Examining the resulting trace shows that the last packet of data
produced by the test application appears to be missing.
The test case enables a channel in "overwrite" mode. Normally, this
would guarantee that the last data produced will always be available in
the resulting trace.
Cause
=====
An empty/inactive flush is performed when rotating "out" of a non-null
trace chunk to ensure that the trace chunk contained at least one
packet.
Looking at the test's resulting trace and by following the consumerd
logs, we see that the test application runs on one CPU for most of its
lifetime. The stream file is repeatedly replaced to make room for the
latest data.
Eventually, the application is migrated to another CPU. A number of
packets are written to this new stream. The session is then stopped
which causes an active flush to occur to close the current packet of
all streams (see `ust_app_stop_trace_all`).
Then, when the session is destroyed, an empty/inactive flush is
performed to ensure that at least one packet was produced in the current
trace chunk [1].
At the moment of writing this empty packet, the consumer daemon sees
that there is not enough space left in the stream file to honour the
trace file size restriction. It thus overwrites the file resulting in
the loss of the last events to replace them with the empty "end of
chunk" packet that occupies a single page.
While the problem is not specific to PowerPC 64, it has a lot more
chances to occur there as pages are typically configured to be of 64kb
length. Due to current implementation limitations, empty packets have
a size of one page.
In other words, 4kb pages typically fit in the space left in the file,
causing the problem to not be easily reproducible on x64.
Note that while the file size limit is specified as "3 * PAGE_SIZE" in
the test, it is rounded-up to 512kb to accomodate at least one
sub-buffer.
Solution
========
[1] This empty/inactive flush is no longer necessary since
f96af312b
as an "open packet" (which performs an empty/inactive flush) is
performed when a stream enters a non-null trace chunk. There is no
concern that a trace chunk will be left empty unless this initial flush
fails (see patch comments and work-around).
The empty flush that was performed for data streams is converted
into an active flush under most circumstances; the packet is simply
closed.
Known drawbacks
===============
None.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5602b7ab8318374f75060489cf9c27af4e058805