Re-write ustcomm parts of UST v2
Changes since v1:
Updated after comments from David Goulet and resulting insights.
* Add a continue after a failed accept
* Fix some malloc issues
* Fix some coding style in the patch
* Make del_named_sock free the memory, even if all else fails.
Notice: valgrind test-case currently broken. Needs an exception.
Description:
This is a very big patch, and so it requires a bit of explaining.
This patch is a step toward accomplishing several goals I have in this
area:
1. Use enums for commands and eliminate text-based commands. This does not mean
that we will stop processing strings for trace/channel and marker names;
just that the long series of if statements with token and string matching
will be replaced with a switch statement. To this end I have created a
ustcomm_header struct that contains the length of the data-field and some
other fields. This allows us to first receive the header, allocate memory
for the data and then receive the data; eliminating all scanning of messages.
2. Reduce the complexity of the implementation. To put it simply, I don't like
callbacks. They reduce transparency and make it difficult to follow the
flow of the code; so I have eliminated multipoll, replacing it with a normal
epoll. I have also replaced almost all the different server, connection and
source structs with one, called ustcomm_sock.
3. Make ustd scale better. Currently ustd scales terribly. We allocate one
thread per cpu per channel per process: five applications, each with three
channels, on a four-cpu machine lead to 5*3*4=60 threads. Part of the reason
for this multitude of threads was that we used a ustcomm_request call
(consisting of a send and a receive) to wait for a subbuffer to be written.
The sequence for a subbuffer to be written was as follows:
Ustd calls send with a 'get_subbuffer' command, and then recv in one of
the threads and hangs on the recv on the socket.
Upon filling the subbuffer the traced app writes '1' to a pipe.
The ust_thread inside the app which was listening to the other end of the
pipe wakes up when the '1' is written. The callback from multipoll calls
a send which sends a reply to the ustd thread over the socket.
The ustd thread wakes up and reads the message, continuing along in its
execution.
I replace this with a bit of a different mechanism, which should allow us
to eventually reduce the number of threads to one per cpu:
Ustd requests a buffer_fd, which causes the ust_thread inside the app
to send the file-descriptor for the read end of the pipe to ustd.
The ustd thread now does a read on the pipe, halting its execution until
the app fills the subbuffer and writes '1' to the pipe, waking up the ustd
thread.
Ustd now makes the 'get_subbuffer' call, which the ust_thread inside the
app responds to with information about the subbuffer. Ustd writes the
subbuffer out and then goes back to the read call, hanging on the pipe.
So we are still stuck with the multitude of threads, but we are in a much
better position to move forward. Replacing the read with an epoll call and
then pointing the epoll event data at the buffer struct containing the
current buffer to which the pipe belongs should be relatively easy. Instead
of spawning a new thread for each buffer, we can then just allocate the
buffer_info struct and assign it to one of the per-cpu threads in ustd to
poll on.
4. Replace poll with epoll which scales better, especially for
events << (nr of fds). This is complete.
5. Allow UST to handle arbitrarily long unix socket names. This is done by
careful allocation of the sockaddr_un struct with a dynamic length.
Truncating is ugly and dangerous.
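To illustrate goal 1, here is a minimal sketch of header-first framing:
receive a fixed-size header, allocate memory for the data, then receive the
data. All names here (sketch_header, sketch_cmd, sketch_send, sketch_recv,
SKETCH_MAX_PAYLOAD) and the two-field layout are assumptions made for the
example; the actual ustcomm_header in the patch carries other fields as well
and is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical command set; the real enum is not shown in this message. */
enum sketch_cmd { SKETCH_GET_SUBBUFFER = 1, SKETCH_PUT_SUBBUFFER = 2 };

/* Hypothetical header: a command plus the length of the data field. */
struct sketch_header {
	uint32_t command;
	uint32_t size;
};

/* Arbitrary cap so a corrupt header cannot trigger a huge allocation. */
#define SKETCH_MAX_PAYLOAD (1u << 20)

/* Read exactly len bytes. Returns 0 on success, -1 on error or early EOF. */
static int read_full(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = read(fd, p, len);

		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Write exactly len bytes. Returns 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Send the header first, then the data (if any). */
int sketch_send(int fd, uint32_t command, const void *data, uint32_t size)
{
	struct sketch_header hdr = { command, size };

	if (write_full(fd, &hdr, sizeof(hdr)) < 0)
		return -1;
	return size ? write_full(fd, data, size) : 0;
}

/* Receive the header, allocate hdr->size bytes, then receive the data.
 * On success the caller owns *data and must free() it. */
int sketch_recv(int fd, struct sketch_header *hdr, char **data)
{
	assert(hdr != NULL && data != NULL);

	*data = NULL;
	if (read_full(fd, hdr, sizeof(*hdr)) < 0)
		return -1;
	if (hdr->size == 0)
		return 0;
	if (hdr->size > SKETCH_MAX_PAYLOAD)
		return -1;
	*data = malloc(hdr->size);
	if (*data == NULL)
		return -1;
	if (read_full(fd, *data, hdr->size) < 0) {
		free(*data);
		*data = NULL;
		return -1;
	}
	return 0;
}
```

The receiver can then switch on hdr.command instead of matching tokens, and
never needs to scan the message for a terminator.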
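For goal 5, the dynamic allocation could look roughly like the sketch below.
The function name sketch_alloc_sockaddr is made up for the example and is not
the helper used in the patch. The sketch covers only the allocation; whether
the kernel accepts an address longer than sizeof(struct sockaddr_un) depends
on the platform and is not checked here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Allocate a sockaddr_un large enough to hold path without truncation.
 * *len_out receives the address length to pass to bind() or connect().
 * Returns NULL on allocation failure; the caller must free() the result. */
struct sockaddr_un *sketch_alloc_sockaddr(const char *path, socklen_t *len_out)
{
	const size_t path_off = offsetof(struct sockaddr_un, sun_path);
	size_t path_len, addr_len, alloc_len;
	struct sockaddr_un *addr;

	assert(path != NULL && len_out != NULL);

	path_len = strlen(path) + 1;	/* include the terminating NUL */
	addr_len = path_off + path_len;

	/* Never allocate less than the full struct, so that code reading
	 * other members of *addr stays within the allocation. */
	alloc_len = addr_len;
	if (alloc_len < sizeof(struct sockaddr_un))
		alloc_len = sizeof(struct sockaddr_un);

	addr = calloc(1, alloc_len);
	if (addr == NULL)
		return NULL;

	addr->sun_family = AF_UNIX;
	/* Copy relative to the start of the allocation, because the path may
	 * be longer than the declared size of sun_path. */
	memcpy((char *)addr + path_off, path, path_len);

	*len_out = (socklen_t)addr_len;
	return addr;
}
```

The result would be cast to (struct sockaddr *) and passed together with the
returned length to bind() or connect(), then freed once the call has returned.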
There is a lot of work still left to be done. This is only the first of a
number of patches that I expect in this area. If someone feels like tackling
ustd head on to reduce the number of threads that would be great.
I have kept Pierre-Marc's form of error handling for the I/O wrapping functions
because I want to propagate return codes up to the apps that are using them
so they can close file-descriptors and free associated resources. If somebody
knows of a better approach please make yourself heard.
Signed-off-by: Nils Carlson <nils.carlson@ericsson.com>
Acked-by: David Goulet <david.goulet@polymtl.ca>