From: Jason Wessel
Date: Wed, 27 Apr 2011 20:22:14 +0000 (+0200)
Subject: ust-consumerd: fix exit race crashes
X-Git-Tag: v1.9.1~380
X-Git-Url: https://git.lttng.org./?a=commitdiff_plain;h=5343b2860738c675311ce5949bdf9e31afc76fa4;p=lttng-ust.git

ust-consumerd: fix exit race crashes

The ust-consumerd gets shutdown by the SIGTERM signal and a number of
places in the ust-consumerd did not properly deal with the case where
a system call returns EINTR in errno as a result of a signal to the
process. The failure to handle EINTR properly was leading to some
data corruption in the buffer code and causing some random "victim"
crashes in lowlevel.c

The way all the offending functions were tracked down was to
temporarily add an abort() in the SIGTERM signal handler. Then it was
a matter of looking at what threads were blocked on system calls at
the time outside of the thread that received the signal.

Signed-off-by: Jason Wessel
Signed-off-by: Nils Carlson
---

diff --git a/libustconsumer/libustconsumer.c b/libustconsumer/libustconsumer.c
index c5acffa0..abf21d80 100644
--- a/libustconsumer/libustconsumer.c
+++ b/libustconsumer/libustconsumer.c
@@ -477,6 +477,8 @@ int consumer_loop(struct ustconsumer_instance *instance, struct buffer_info *buf
 			DBG("App died while being traced");
 			finish_consuming_dead_subbuffer(instance->callbacks, buf);
 			break;
+		} else if (read_result == -1 && errno == EINTR) {
+			continue;
 		}
 
 		if(instance->callbacks->on_read_subbuffer)
@@ -783,8 +785,11 @@ int ustconsumer_stop_instance(struct ustconsumer_instance *instance, int send_ms
 
 	struct sockaddr_un addr;
 
+socket_again:
 	result = fd = socket(PF_UNIX, SOCK_STREAM, 0);
 	if(result == -1) {
+		if (errno == EINTR)
+			goto socket_again;
 		PERROR("socket");
 		return 1;
 	}
@@ -794,13 +799,21 @@ int ustconsumer_stop_instance(struct ustconsumer_instance *instance, int send_ms
 	strncpy(addr.sun_path, instance->sock_path, UNIX_PATH_MAX);
 	addr.sun_path[UNIX_PATH_MAX-1] = '\0';
 
+connect_again:
 	result = connect(fd, (struct sockaddr *)&addr, sizeof(addr));
 	if(result == -1) {
+		if (errno == EINTR)
+			goto connect_again;
 		PERROR("connect");
 	}
 
-	while(bytes != sizeof(msg))
-		bytes += send(fd, msg, sizeof(msg), 0);
+	while(bytes != sizeof(msg)) {
+		int inc = send(fd, msg, sizeof(msg), 0);
+		if (inc < 0 && errno != EINTR)
+			break;
+		else
+			bytes += inc;
+	}
 
 	close(fd);
 
diff --git a/ust-consumerd/ust-consumerd.c b/ust-consumerd/ust-consumerd.c
index ce2ee40a..c9613945 100644
--- a/ust-consumerd/ust-consumerd.c
+++ b/ust-consumerd/ust-consumerd.c
@@ -210,7 +210,11 @@ int on_open_buffer(struct ustconsumer_callbacks *data, struct buffer_info *buf)
 			trace_path, buf->pid, buf->pidunique, buf->name);
 		return 1;
 	}
+again:
 	result = fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC | O_EXCL, 00600);
+	if (result == -1 && errno == EINTR)
+		goto again;
+
 	if(result == -1) {
 		PERROR("open");
 		ERR("failed opening trace file %s", tmp);
@@ -225,7 +229,12 @@ int on_open_buffer(struct ustconsumer_callbacks *data, struct buffer_info *buf)
 int on_close_buffer(struct ustconsumer_callbacks *data, struct buffer_info *buf)
 {
 	struct buffer_info_local *buf_local = buf->user_data;
-	int result = close(buf_local->file_fd);
+	int result;
+
+again:
+	result = close(buf_local->file_fd);
+	if (result == -1 && errno == EINTR)
+		goto again;
 	free(buf_local);
 	if(result == -1) {
 		PERROR("close");
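
For illustration only, the EINTR handling this patch applies to send() could be factored into a small helper along the following lines. This is a minimal sketch and not code from lttng-ust: the name send_all is hypothetical, and it assumes a blocking stream socket. It differs from the loop in the patch in two ways: an interrupted call (return value -1 with errno == EINTR) is retried without adding the -1 to the running byte count, and after a partial send only the remaining bytes are sent.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of lttng-ust): send the whole buffer on a
 * blocking socket, retrying when send() is interrupted by a signal.
 * Returns the number of bytes sent (== len) on success, or -1 with errno
 * set on any error other than EINTR.
 */
static ssize_t send_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	size_t sent = 0;

	while (sent < len) {
		ssize_t n = send(fd, p + sent, len - sent, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted before any data moved: retry */
			return -1;		/* real error, errno describes it */
		}
		sent += (size_t)n;		/* partial send: advance past what went out */
	}
	return (ssize_t)sent;
}
```

A caller would then treat a return value of -1 as a hard failure and anything else as a complete send, for example `if (send_all(fd, msg, sizeof(msg)) < 0) PERROR("send");`.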