Observed issue
==============
In CI, this test failed intermittently. In failing runs, the pipe
size reported by the `default_pipe_size_getter` application was 8192
bytes; in passing runs it was 65536 bytes.
```
ERROR: tools/notification/test_notification_notifier_discarded_count
====================================================================
1..41
ok 1 - Add trigger my_trigger
PASS: tools/notification/test_notification_notifier_discarded_count 1 - Add trigger my_trigger
---
duration_ms: 1323.966137
...
ok 2 - No discarded tracer notification
PASS: tools/notification/test_notification_notifier_discarded_count 2 - No discarded tracer notification
---
duration_ms: 22.021590
...
ok 3 - Generating 390 tracer notifications
PASS: tools/notification/test_notification_notifier_discarded_count 3 - Generating 390 tracer notifications
---
duration_ms: 154.790871
...
not ok 4 - Discarded tracer notification number non-zero (0) as expected
FAIL: tools/notification/test_notification_notifier_discarded_count 4 - Discarded tracer notification number non-zero (0) as expected
---
duration_ms: 24.323759
...
```
Cause
=====
The initial size of a pipe on Linux can take one of several values:
1) `16 * PAGE_SIZE` (as documented in `man 7 pipe`), since Linux 2.6.11
2) When a user has many pipes open and is above a soft limit:
* `2 * PAGE_SIZE` (undocumented, see [1]), as of Linux 5.14 [2]
* `1 * PAGE_SIZE`, since Linux 2.6.35 [3]
Because `default_pipe_size_getter` opened a new pipe to check its
size, it could run at a moment when the user already had pipe buffers
open beyond the soft limit, in which case the lower value was
returned. The previously started sessiond, however, may have opened
its pipe with the larger default size. The test then generated too
few events to overflow the notification pipe, so no notification was
discarded.
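The candidate sizes listed above can be worked out from the page size. The following is a minimal sketch, not part of this change; `candidate_pipe_sizes` is a hypothetical helper name, and the multipliers are the ones given in the list above.

```shell
# Hypothetical helper: print the three candidate initial pipe sizes
# (16, 2 and 1 pages) for a page size given in bytes.
candidate_pipe_sizes() {
    local page_size="$1"
    echo "$((16 * page_size)) $((2 * page_size)) $((1 * page_size))"
}

# Print the candidates for the page size of the running system.
candidate_pipe_sizes "$(getconf PAGESIZE)"
```

With 4096-byte pages, the first two candidates are 65536 and 8192, which match the two values observed in CI.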
Solution
========
Use the maximum pipe size (on Linux, from
`/proc/sys/fs/pipe-max-size`) for the estimated pipe size rather than
opening a pipe and checking its size.
Known drawbacks
===============
When the maximum pipe size is much larger than the actual size of the
notification pipe, many more events are emitted than are necessary to
complete the test.
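To give a rough sense of the overhead, the sketch below computes an upper bound on the number of notifications that fit in a pipe of a given size. It is an illustration only and may not match how the test derives its iteration count: `max_notifications_that_fit` is a hypothetical helper, the 42-byte minimum comes from the comment in the test script, and 1048576 is assumed here as the usual Linux default for `/proc/sys/fs/pipe-max-size`.

```shell
# Hypothetical helper: upper bound on how many notifications fit in a pipe
# of pipe_size bytes when each notification takes at least min_size bytes.
max_notifications_that_fit() {
    local pipe_size="$1"
    local min_size="${2:-42}"
    echo "$((pipe_size / min_size))"
}

# Compare the two observed initial sizes with the assumed default maximum.
for size in 8192 65536 1048576; do
    echo "$size bytes: at most $(max_notifications_that_fit "$size") notifications"
done
```

Under these assumptions, the estimate based on the maximum is about 16 times the one based on the 65536-byte default.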
References
==========
[1]: https://gitlab.com/linux-kernel/stable/-/blob/3e9bff3bbe1355805de919f688bef4baefbfd436/fs/pipe.c#L809
[2]: See upstream commit 46c4c9d1beb7f5b4cec4dd90e7728720583ee348
[3]: See upstream commit 6a6ca57de92fcae34603551ac944aa74758c30d4
Change-Id: Id547a1d772b5a7f9b18ffa686ff6644afca4ab15
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
diag "UST event notifer error counter"
- PIPE_SIZE=$("$CURDIR"/default_pipe_size_getter)
- if [ $? -ne 0 ]; then
- BAIL_OUT "Failed to get system default pipe size"
- else
- diag "Default system pipe size: $PIPE_SIZE bytes"
- fi
+ PIPE_SIZE=$(get_pipe_max_size)
# Find the number of events needed to overflow the event notification
# pipe buffer. Each LTTng-UST notification is at least 42 bytes long.
useradd --no-create-home "$new_user"
new_uid=$(id -u "$new_user")
- PIPE_SIZE=$("$CURDIR"/default_pipe_size_getter)
- if [ $? -ne 0 ]; then
- BAIL_OUT "Failed to get system default pipe size"
- else
- diag "Default system pipe size: $PIPE_SIZE bytes"
- fi
+ PIPE_SIZE=$(get_pipe_max_size)
# Find the number of events needed to overflow the event notification
# pipe buffer. Each LTTng-UST notification is at least 42 bytes long.
local NR_ITER
local new_user="dummy_lttng_test_user"
- PIPE_SIZE=$("$CURDIR"/default_pipe_size_getter)
- if [ $? -ne 0 ]; then
- BAIL_OUT "Failed to get system default pipe size"
- else
- diag "Default system pipe size: $PIPE_SIZE bytes"
- fi
+ PIPE_SIZE=$(get_pipe_max_size)
# Find the number of events needed to overflow the event notification
# pipe buffer. Each LTTng-UST notification is at least 42 bytes long.
echo
}
+function get_pipe_max_size()
+{
+ if grep -q 'FreeBSD' /etc/os-release ; then
+ # Kernel configuration dependant, but defaults to 64 * 1024
+ # https://github.com/freebsd/freebsd-src/blob/5b0dc991093c82824f6fe566af947f64f5072264/sys/sys/pipe.h#L33
+ echo 65536
+ else
+ cat /proc/sys/fs/pipe-max-size
+ fi
+}
+
# Return a space-separated string of online CPU IDs, based on
# /sys/devices/system/cpu/online, or from 0 to nproc - 1 otherwise.
function get_online_cpus()