Introduce LTTNG_UST_MAP_POPULATE_POLICY environment variable
Problem Statement
-----------------
Commit 4d4838bad480 ("Use MAP_POPULATE to reduce pagefault when available")
was introduced in v2.11.0 and never backported to stable branches. Its
purpose was to reduce the tracer fast-path latency caused by handling
minor page faults the first time a given application writes to each
page of the ring buffer after mapping it. The discussion thread leading
to this commit can be found at [1]. When using LTTng-UST to diagnose
real-time applications with very strict constraints, this added latency
is unwanted.
That commit introduced the MAP_POPULATE flag when mapping the ring
buffer pages, which causes the kernel to pre-populate the page table
entries (PTE).
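As an illustration only, the difference between the two mapping modes
can be sketched as follows. The helper name map_ring_buffer() and its
"populate" parameter are hypothetical and are not taken from the
LTTng-UST sources; only mmap(2) and the Linux-specific MAP_POPULATE
flag are real interfaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not the LTTng-UST implementation): map "len"
 * bytes of the shared memory file "fd". When "populate" is true and
 * the platform provides MAP_POPULATE, ask the kernel to pre-populate
 * the page table entries at mmap() time rather than on first access.
 */
static void *map_ring_buffer(int fd, size_t len, bool populate)
{
	int flags = MAP_SHARED;

#ifdef MAP_POPULATE
	if (populate)
		flags |= MAP_POPULATE;
#else
	(void) populate;	/* Flag unavailable: always populate lazily. */
#endif
	/* Returns MAP_FAILED on error, like mmap() itself. */
	return mmap(NULL, len, PROT_READ | PROT_WRITE, flags, fd, 0);
}
```

With populate set to false, each page takes a minor fault on its first
write; with populate set to true, that cost is paid up front when the
buffer is mapped.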
Pre-populating the pages has, however, unintended consequences in the
following scenarios:
* Short-lived applications that write very little to the ring buffer end
  up taking more time to start, because of the time it takes to
  pre-populate all the ring buffer pages, even though they typically won't
  be used by the application.
* Containerized workloads using cpusets will also end up having longer
  application startup time than strictly required, and will populate
  PTE for ring buffers of CPUs which are not present in the cpuset.
There are, therefore, two sets of irreconcilable requirements:
short-lived and containerized workloads benefit from lazily populating
the PTE, whereas real-time workloads benefit from pre-populating them.
A tunable environment variable is thus needed to let the end user choose
the behavior for each application.
Solution
--------
Allow users to specify whether they want to pre-populate
shared memory pages within the application with an environment
variable.
  LTTNG_UST_MAP_POPULATE_POLICY
      If set, override the policy used to populate shared memory pages
      within the application. The expected values are:
      none
        Do not pre-populate any pages, take minor faults on first access
        while tracing.
      cpu_possible
        Pre-populate pages for all possible CPUs in the system, as listed
        by /sys/devices/system/cpu/possible.
      Default: none. If the policy is unknown, use the default.
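The selection rules above (two known values, fallback to the default
for an unset or unknown value) can be sketched as follows. This is a
minimal sketch, not the code added by this commit: the enum, the
function names and the use of plain getenv(3) are assumptions made for
illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical names, chosen for this sketch only. */
enum populate_policy {
	POPULATE_NONE,		/* Lazily populate: fault on first access. */
	POPULATE_CPU_POSSIBLE,	/* Pre-populate for all possible CPUs. */
};

/* Map a policy string to its enum value; NULL means "variable unset". */
static enum populate_policy parse_populate_policy(const char *value)
{
	if (value && strcmp(value, "cpu_possible") == 0)
		return POPULATE_CPU_POSSIBLE;
	/* Unset, "none", or unknown value: use the default (none). */
	return POPULATE_NONE;
}

/* Read the policy from the environment of the traced application. */
static enum populate_policy get_populate_policy(void)
{
	return parse_populate_policy(getenv("LTTNG_UST_MAP_POPULATE_POLICY"));
}
```

Under these assumptions, an application started with
LTTNG_UST_MAP_POPULATE_POLICY=cpu_possible would get
POPULATE_CPU_POSSIBLE, and any other setting would yield POPULATE_NONE.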
Choice of the default
---------------------
Given that users with strict real-time constraints already have to set up
their tracing with specific options (see the "--read-timer"
lttng-enable-channel(1) option [2]), it makes sense that the default
is to lazily populate the ring buffer PTE, and to require users with
real-time constraints to explicitly enable pre-population through an
environment variable.
Effect on default behavior
--------------------------
The default behavior for ring buffer PTE mapping will be changing across
LTTng-UST versions in the following way:
- 2.10 and earlier: lazily populate PTE,
- 2.11-2.13: pre-populate PTE,
- 2.14: lazily populate PTE.
LTTng-UST 2.14 will revert to the 2.10 lazy populate scheme by
default.
[1] https://lists.lttng.org/pipermail/lttng-dev/2019-July/thread.html#29094
[2] https://lttng.org/docs/v2.13/#doc-channel-timers
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: I6743b08cd1fe0d956caaf6aad63005555bb9640e