From: Jérémie Galarneau
Date: Tue, 24 Sep 2019 05:10:58 +0000 (-0400)
Subject: Fix: sessiond: fs.protected_regular sysctl breaks app registration
X-Git-Tag: v2.10.8~15
X-Git-Url: https://git.lttng.org./?a=commitdiff_plain;h=0052cc6026ed9cba2903b8bfec7e283fe99cbf22;p=lttng-tools.git

Fix: sessiond: fs.protected_regular sysctl breaks app registration

I observed that userspace tracing no longer worked when an
instrumented application (linked against liblttng-ust) was launched
before the session daemon.

While investigating this, I noticed that the shm_open() of
'/lttng-ust-wait-8' failed with EACCES. As the permissions on the
'/dev/shm' directory and the file itself should have allowed the
session daemon to open the shm, this pointed to a change in kernel
behaviour.

Moreover, it appeared that this could only be reproduced on my
system (running Arch Linux) and not on other systems.

It turns out that Linux 4.19 introduces a new protected_regular
sysctl to allow the mitigation of a class of TOCTOU security issues
related to the creation of files and FIFOs in sticky directories.

When this sysctl is not set to '0', it specifically blocks the way
the session daemon attempts to open the app notification shm that an
application has already created.

To quote a comment added in linux's fs/namei.c as part of 30aba6656f:
```
Block an O_CREAT open of a FIFO (or a regular file) when:
  - sysctl_protected_fifos (or sysctl_protected_regular) is enabled
  - the file already exists
  - we are in a sticky directory
  - we don't own the file
  - the owner of the directory doesn't own the file
  - the directory is world writable
```

While the concerns that led to the inclusion of this patch are valid,
the risks that are being mitigated do not apply to the session
daemon's and instrumented application's use of this shm. This shm is
only used to wake-up applications and get them to attempt to connect
to the session daemon's application socket.
The application socket is the part that is security sensitive. At
worst, an attacker controlling this shm could wake up the UST thread
in applications which would then attempt to connect to the session
daemon.

Unfortunately (for us, at least), systemd v241+ sets the
protected_regular sysctl to 1 by default (see systemd commit
27325875), causing the open of the shm by the session daemon to fail.

Introduce a fall-back to attempt a shm_open without the O_CREAT flag
when opening it with 'O_RDWR | O_CREAT' fails. The comments detail the
reason why those attempts are made in that specific order.

Signed-off-by: Jérémie Galarneau
---

diff --git a/src/bin/lttng-sessiond/shm.c b/src/bin/lttng-sessiond/shm.c
index 23bd9db20..60a92cd15 100644
--- a/src/bin/lttng-sessiond/shm.c
+++ b/src/bin/lttng-sessiond/shm.c
@@ -70,11 +70,44 @@ static int get_wait_shm(char *shm_path, size_t mmap_size, int global)
 	/*
 	 * Try creating shm (or get rw access). We don't do an exclusive open,
 	 * because we allow other processes to create+ftruncate it concurrently.
+	 *
+	 * A sysctl, fs.protected_regular may prevent the session daemon from
+	 * opening a previously created shm when the O_CREAT flag is provided.
+	 * Systemd enables this ABI-breaking change by default since v241.
+	 *
+	 * First, attempt to use the create-or-open semantic that is
+	 * desired here. If this fails with EACCES, work around this broken
+	 * behaviour and attempt to open the shm without the O_CREAT flag.
+	 *
+	 * The two attempts are made in this order since applications are
+	 * expected to race with the session daemon to create this shm.
+	 * Attempting an shm_open() without the O_CREAT flag first could fail
+	 * because the file doesn't exist. It could then be created by an
+	 * application, which would cause a second try with the O_CREAT flag to
+	 * fail with EACCES.
+	 *
+	 * Note that this introduces a new failure mode where a user could
+	 * launch an application (creating the shm) and unlink the shm while
+	 * the session daemon is launching, causing the second attempt
+	 * to fail. This is not recovered-from as unlinking the shm will
+	 * prevent userspace tracing from succeeding anyhow: the sessiond would
+	 * use a now-unlinked shm, while the next application would create
+	 * a new named shm.
 	 */
 	wait_shm_fd = shm_open(shm_path, O_RDWR | O_CREAT, mode);
 	if (wait_shm_fd < 0) {
-		PERROR("Failed to open wait shm at %s", shm_path);
-		goto error;
+		if (errno == EACCES) {
+			/* Work around sysctl fs.protected_regular. */
+			DBG("shm_open of %s returned EACCES, this may be caused "
+					"by the fs.protected_regular sysctl. "
+					"Attempting to open the shm without "
+					"creating it.", shm_path);
+			wait_shm_fd = shm_open(shm_path, O_RDWR, mode);
+		}
+		if (wait_shm_fd < 0) {
+			PERROR("Failed to open wait shm at %s", shm_path);
+			goto error;
+		}
 	}
 
 	ret = ftruncate(wait_shm_fd, mmap_size);
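
For readers who want to try the fall-back outside of the session daemon,
the open order used by the patch can be sketched as a stand-alone helper.
This is only an illustration under stated assumptions: the function names
open_or_create_shm() and open_sized_shm() are hypothetical and do not
exist in lttng-tools, the sessiond's PERROR()/DBG() logging is left out,
and the permission mode is whatever the caller passes in.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of lttng-tools): open a named shm with
 * create-or-open semantics, retrying without O_CREAT on EACCES.
 * Returns a file descriptor, or -1 with errno set.
 */
static int open_or_create_shm(const char *name, mode_t mode)
{
	int fd;

	/* Preferred path: create the shm, or open it if it already exists. */
	fd = shm_open(name, O_RDWR | O_CREAT, mode);
	if (fd < 0 && errno == EACCES) {
		/*
		 * EACCES may come from fs.protected_regular when the shm was
		 * already created by another user; retry as a plain open.
		 */
		fd = shm_open(name, O_RDWR, mode);
	}
	return fd;
}

/*
 * Hypothetical wrapper: open the shm and size it, closing the file
 * descriptor (and preserving errno) if ftruncate() fails.
 */
static int open_sized_shm(const char *name, mode_t mode, off_t size)
{
	int fd = open_or_create_shm(name, mode);

	if (fd < 0) {
		return -1;
	}
	if (ftruncate(fd, size) < 0) {
		int saved_errno = errno;

		close(fd);
		errno = saved_errno;
		return -1;
	}
	return fd;
}
```

As in the patch, the O_CREAT attempt comes first so that a racing
application creating the shm between the two calls cannot make both of
them fail. On glibc older than 2.34, shm_open() requires linking with
-lrt.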