From: Jérémie Galarneau Date: Mon, 28 Oct 2019 19:37:34 +0000 (-0400) Subject: Fix: consumerd: NULL pointer dereference during metadata sync X-Git-Tag: v2.12.0-rc1~274 X-Git-Url: https://git.lttng.org./?a=commitdiff_plain;h=cdb72e4e9298c5a5852c6995de7a58fe7fdeaaac;p=lttng-tools.git Fix: consumerd: NULL pointer dereference during metadata sync The following crash was reported when short-lived applications are traced in a live session with per-pid buffering channels. From the original report: ``` Thread 1 (Thread 0x7f72b67fc700 (LWP 1912155)): #0 0x00005650b3f6ccbd in commit_one_metadata_packet (stream=0x7f729c010bf0) at ust-consumer.c:2537 #1 0x00005650b3f6cf58 in lttng_ustconsumer_sync_metadata (ctx=0x5650b588ce60, metadata=0x7f729c010bf0) at ust-consumer.c:2608 #2 0x00005650b3f4dba3 in do_sync_metadata (metadata=0x7f729c010bf0, ctx=0x5650b588ce60) at consumer-stream.c:471 #3 0x00005650b3f4dd3c in consumer_stream_sync_metadata (ctx=0x5650b588ce60, session_id=0) at consumer-stream.c:548 #4 0x00005650b3f6de78 in lttng_ustconsumer_read_subbuffer (stream=0x7f729c0058e0, ctx=0x5650b588ce60) at ust-consumer.c:2917 #5 0x00005650b3f45196 in lttng_consumer_read_subbuffer (stream=0x7f729c0058e0, ctx=0x5650b588ce60) at consumer.c:3524 #6 0x00005650b3f42da7 in consumer_thread_data_poll (data=0x5650b588ce60) at consumer.c:2894 #7 0x00007f72bdc476db in start_thread (arg=0x7f72b67fc700) at pthread_create.c:463 #8 0x00007f72bd97088f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 The segfault happen on the access to 'stream->chan->metadata_cache->lock' chan value here is zero. ``` The problem is easily reproducible if a sleep(1) is added just after the call to lttng_ustconsumer_request_metadata(), before the metadata stream lock is re-acquired. During the execution of the "request_metadata", an application can close. This will cause the session daemon to push any remaining metadata to the consumer daemon and to close the metadata channel. Closing the metadata channel closes the metadata stream's wait_fd, which is an internal pipe. The closure of the metadata pipe is detected by the metadata_poll thread, which will ensure that all metadata has been consumed before issuing the deletion of the metadata stream and channel. During the deletion, the channel's "stream" attribute the stream's "chan" attribute are set to NULL as both are logically deleted and should not longer be used. Meanwhile, the thread executing commit_one_metadata_packet() re-acquires the metadata stream lock and trips on the now-NULL "chan" member. The fix consists in checking if the metadata stream is logically deleted after its lock is re-acquired. It is correct for the sync_metadata operation to then complete successfully as the metadata is synced: the metadata guarantees this before deleting the stream/channel. Since the metadata stream's lifetime is protected by its lock, there may be other sites that need such a check. The lock and deletion check could be combined into a single consumer_stream_lock() helper in follow-up fixes. Reported-by: Jonathan Rajotte Signed-off-by: Jérémie Galarneau --- diff --git a/src/common/consumer/consumer-stream.c b/src/common/consumer/consumer-stream.c index fe1178795..cdda9412e 100644 --- a/src/common/consumer/consumer-stream.c +++ b/src/common/consumer/consumer-stream.c @@ -645,3 +645,13 @@ int consumer_stream_rotate_output_files(struct lttng_consumer_stream *stream) end: return ret; } + +bool consumer_stream_is_deleted(struct lttng_consumer_stream *stream) +{ + /* + * This function does not take a const stream since + * cds_lfht_is_node_deleted was not const before liburcu 0.12. + */ + assert(stream); + return cds_lfht_is_node_deleted(&stream->node.node); +} diff --git a/src/common/consumer/consumer-stream.h b/src/common/consumer/consumer-stream.h index 44a5b0fc1..5f3a34e3e 100644 --- a/src/common/consumer/consumer-stream.h +++ b/src/common/consumer/consumer-stream.h @@ -94,4 +94,13 @@ int consumer_stream_create_output_files(struct lttng_consumer_stream *stream, */ int consumer_stream_rotate_output_files(struct lttng_consumer_stream *stream); +/* + * Indicates whether or not a stream is logically deleted. A deleted stream + * should no longer be used; its existence is only garanteed by the RCU lock + * held by the caller. + * + * This function must be called with the RCU read side lock held. + */ +bool consumer_stream_is_deleted(struct lttng_consumer_stream *stream); + #endif /* LTTNG_CONSUMER_STREAM_H */ diff --git a/src/common/ust-consumer/ust-consumer.c b/src/common/ust-consumer/ust-consumer.c index 46b644d34..27f503405 100644 --- a/src/common/ust-consumer/ust-consumer.c +++ b/src/common/ust-consumer/ust-consumer.c @@ -2582,37 +2582,59 @@ end: * interacting with sessiond, else we cause a deadlock with live * awaiting on metadata to be pushed out. * + * The RCU read side lock must be held by the caller. + * * Return 0 if new metadatda is available, EAGAIN if the metadata stream * is empty or a negative value on error. */ int lttng_ustconsumer_sync_metadata(struct lttng_consumer_local_data *ctx, - struct lttng_consumer_stream *metadata) + struct lttng_consumer_stream *metadata_stream) { int ret; int retry = 0; + struct lttng_consumer_channel *metadata_channel; assert(ctx); - assert(metadata); + assert(metadata_stream); - pthread_mutex_unlock(&metadata->lock); + metadata_channel = metadata_stream->chan; + pthread_mutex_unlock(&metadata_stream->lock); /* * Request metadata from the sessiond, but don't wait for the flush * because we locked the metadata thread. */ - ret = lttng_ustconsumer_request_metadata(ctx, metadata->chan, 0, 0); - pthread_mutex_lock(&metadata->lock); + ret = lttng_ustconsumer_request_metadata(ctx, metadata_channel, 0, 0); + pthread_mutex_lock(&metadata_stream->lock); if (ret < 0) { goto end; } - ret = commit_one_metadata_packet(metadata); + /* + * The metadata stream and channel can be deleted while the + * metadata stream lock was released. The streamed is checked + * for deletion before we use it further. + * + * Note that it is safe to access a logically-deleted stream since its + * existence is still guaranteed by the RCU read side lock. However, + * it should no longer be used. The close/deletion of the metadata + * channel and stream already guarantees that all metadata has been + * consumed. Therefore, there is nothing left to do in this function. + */ + if (consumer_stream_is_deleted(metadata_stream)) { + DBG("Metadata stream %" PRIu64 " was deleted during the metadata synchronization", + metadata_stream->key); + ret = 0; + goto end; + } + + ret = commit_one_metadata_packet(metadata_stream); if (ret <= 0) { goto end; } else if (ret > 0) { retry = 1; } - ret = ustctl_snapshot(metadata->ustream); + ret = ustctl_snapshot(metadata_stream->ustream); if (ret < 0) { if (errno != EAGAIN) { ERR("Sync metadata, taking UST snapshot");