Add new thread in consumer for metadata handling
To prioritize the consumption of metadata, this patch introduces a
new thread in the consumer that exclusively handles metadata,
separating it from the trace data.
The motivation behind this change is that once a start command is
issued on the tracer (kernel or UST), the start waits up to 10 seconds
for the metadata to be written (LTTNG_METADATA_TIMEOUT_MSEC). However,
when there is not enough space in the metadata buffers, the tracer
waits so as not to drop data. If the write(s) are still unsuccessful
after the timeout, the start session command fails.
This problem can occur with network streaming of high-throughput
data, such as enable-event -a -k, over a low-bandwidth connection.
The separation between metadata and trace data does the trick:
consuming metadata no longer depends on the arbitrary time needed to
stream trace data while metadata buffers need to be consumed.
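
The separation can be illustrated with a minimal sketch. This is not the
consumer's actual code (the consumer is written in C); it is a Python
model in which every name (consume, run_consumer, the queue names and
the delay values) is hypothetical. Each stream type gets its own queue
and its own thread, so slow trace data never sits in front of metadata.

```python
import queue
import threading
import time


def consume(q, sent, delay_s):
    """Drain one queue until a None sentinel arrives.

    delay_s stands in for the time needed to stream one item.
    """
    while True:
        item = q.get()
        if item is None:
            break
        time.sleep(delay_s)
        sent.append((item, time.monotonic()))


def run_consumer(n_data=5, n_metadata=2, data_delay_s=0.05):
    """Run one data thread and one metadata thread side by side."""
    data_q, metadata_q = queue.Queue(), queue.Queue()
    data_sent, metadata_sent = [], []
    threads = [
        threading.Thread(target=consume,
                         args=(data_q, data_sent, data_delay_s)),
        threading.Thread(target=consume,
                         args=(metadata_q, metadata_sent, 0.0)),
    ]
    for t in threads:
        t.start()
    # Trace data is queued first, yet metadata does not wait behind it.
    for i in range(n_data):
        data_q.put(f"data-{i}")
    for i in range(n_metadata):
        metadata_q.put(f"metadata-{i}")
    data_q.put(None)
    metadata_q.put(None)
    for t in threads:
        t.join()
    return data_sent, metadata_sent
```

With a single shared queue and thread, the metadata items above would
only be handled after all trace data had been streamed.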
Of course, this fix is more _visible_ on multiprocessor/multicore
machines but can also help on a single processor to prioritize
metadata consumption.
It helps on single-processor machines too because the scheduler will
schedule both the data and metadata threads. Even if the data thread
needs to send many MB of data, if the metadata thread sends small
enough metadata, we should be good with half of the CPU time.
I see that the metadata easily reaches 192k for kernel traces though.
On a 5KB/s connection, this adds up to about 38s. However, because the
10s delay is allowed between each sub-buffer, we don't reach the
limit. This limits us to small trace packet sizes though, if we ever
have lots of metadata. E.g. on a 5KB/s connection, metadata buffers
configured as 2x64KB, with a metadata size of e.g. 512KB, would
trigger the 10s delay error.
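
The arithmetic above can be checked with a short sketch. It rests on a
simplifying assumption that is not taken from the code: a writer
blocked on full metadata buffers has to wait for one whole sub-buffer
to be streamed before space frees up. The function names and the
timeout constant expressed in seconds are hypothetical.

```python
KB = 1024
# The 10 second limit described above (LTTNG_METADATA_TIMEOUT_MSEC),
# expressed here in seconds for convenience.
METADATA_TIMEOUT_S = 10.0


def drain_seconds(nbytes, bandwidth_bytes_per_s):
    """Time needed to stream nbytes over the link."""
    return nbytes / bandwidth_bytes_per_s


def start_would_time_out(metadata_bytes, n_subbuf, subbuf_bytes,
                         bandwidth_bytes_per_s,
                         timeout_s=METADATA_TIMEOUT_S):
    """True if the writer must wait longer than the timeout.

    Assumed model: the writer only blocks when the metadata does not
    fit in the buffers, and it then waits for one sub-buffer to drain.
    """
    overflows = metadata_bytes > n_subbuf * subbuf_bytes
    wait_s = drain_seconds(subbuf_bytes, bandwidth_bytes_per_s)
    return overflows and wait_s > timeout_s


# 192KB of metadata over 5KB/s: roughly 38s in total.
total_s = drain_seconds(192 * KB, 5 * KB)

# 2x64KB buffers, 512KB of metadata, 5KB/s: one sub-buffer takes
# 64 / 5 = 12.8s to drain, which exceeds the 10s limit.
fails = start_would_time_out(512 * KB, 2, 64 * KB, 5 * KB)
```

Under the same model, a sub-buffer smaller than 50KB drains within 10s
at 5KB/s, which is consistent with the limit not being reached as long
as packets stay small.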
So we should be good for now, but removing this arbitrary 10s delay is
something to keep in mind as a future improvement.
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: David Goulet <dgoulet@efficios.com>