From: Jérémie Galarneau Date: Tue, 14 Dec 2021 19:52:24 +0000 (-0500) Subject: Fix: relayd comm: improperly packed rotate streams command header X-Git-Url: http://git.lttng.org./?a=commitdiff_plain;ds=inline;h=ce9dbd47eea9fe023691518ef46b58c03b89c236;hp=ce9dbd47eea9fe023691518ef46b58c03b89c236;p=lttng-tools.git Fix: relayd comm: improperly packed rotate streams command header Observed issue ============== The implicit rotation of a session performed during its destruction fails on LTTng 2.12 (and upwards) with the following errors: Error: Relayd rotate streams replied error 152 Error: Relayd rotate stream failed. Cleaning up relayd 2 Error: Rotate channel failed Failed to find relay daemon socket: relayd_id = 2 Error: Failed to perform a quiet rotation as part of the destruction of session "my_session": Rotation failure on consumer Cause ===== Error 152 matches the LTTNG_ERR_INVALID_PROTOCOL error, which implies that the consumer daemon sent an unexpected command to the relay daemon. It was determined that the RELAYD_ROTATE_STREAMS command header is not properly packed since the LTTNG_PACKED annotation was omitted from its `new_chunk_id` optional field. The documentation of LTTNG_OPTIONAL_COMM duly indicates that this is required. Without the use of LTTNG_PACKED, various lengths of padding (3 or 7 bytes) are inserted between new_chunk_id's `is_set` and `value` field to align `value`, which results in an incorrect interpretation of the command's arguments. The relay daemon catches the protocol error when it is built in a configuration that inserts 7 bytes of padding, while the consumer only inserts three. Solution ======== The solution proposed here is not perfect, see "Known drawbacks". First, if we were to annotate the field, patched consumer daemons would send unintelligible command headers to unpatched relay daemons. Leaving it as is is the least of all evils, see "Known drawbacks" for more details. From the relay daemon end, we can easily anticipate the padding problem by reading the `stream_count` field and use it to determine the expected size of the payload. The difference between the actual size of the payload and the expected size allows us to determine the padding size and use the appropriate declaration of the structure to interpret the command's arguments. Known drawbacks =============== While this fix causes the relay daemon to handle all improperly packed command headers received from an unpatched consumer daemon, the reverse is not completely true. The following tables show which cases are known to work and which are known to be broken when patched and unpatched versions of the relay and consumer daemons are mixed, with the various alignment constraints. Note that here: - 4 byte alignement implies "daemon running on an architecture on which uint64_t is aligned on an 4-byte boundary" (e.g. x86), - 8 byte alignement implies "daemon running on an architecture on which uint64_t is aligned on an 8-byte boundary" (e.g. x86-64). Scenario 1 - Unpatched relay daemon and unpatched consumer daemon ----------------------------------------------------------------------------------- | Architecture alignment | 4 byte alignement consumerd | 8 byte alignment consumerd | |------------------------|-----------------------------|----------------------------| | 4 byte alignment relay | Works | Fails | | 8 byte alignment relay | Fails | Works | ----------------------------------------------------------------------------------- Scenario 2 - Patched relay daemon and unpatched consumer daemon ----------------------------------------------------------------------------------- | Architecture alignment | 4 byte alignement consumerd | 8 byte alignment consumerd | |------------------------|-----------------------------|----------------------------| | 4 byte alignment relay | Works | Works | | 8 byte alignment relay | Works | Works | ----------------------------------------------------------------------------------- Scenario 3 - Unpatched relay daemon and patched consumer daemon ----------------------------------------------------------------------------------- | Architecture alignment | 4 byte alignement consumerd | 8 byte alignment consumerd | |------------------------|-----------------------------|----------------------------| | 4 byte alignment relay | Works | Fails | | 8 byte alignment relay | Fails | Works | ----------------------------------------------------------------------------------- Scenario 4 - Patched relay daemon and patched consumer daemon ----------------------------------------------------------------------------------- | Architecture alignment | 4 byte alignement consumerd | 8 byte alignment consumerd | |------------------------|-----------------------------|----------------------------| | 4 byte alignment relay | Works | Works | | 8 byte alignment relay | Works | Works | ----------------------------------------------------------------------------------- Note that Scenarios 1 and 3 are the same since this fix doesn't change the behaviour of the consumer daemon. Also note that packing the `new_chunk_id` field would break the two working cases of scenario 3 which are, in all likelyhood, the more common cases. A new command using a properly packed version of the command's header could be implemented in future versions, but this presents no benefit as part of this fix. Signed-off-by: Jérémie Galarneau Change-Id: I3b822a9766ea5be51996a771a0d4a90efe29ce0b ---