On ARMv7l (Cubietruck), the compiler generates a function call for each
lib_ring_buffer_check_deliver, even though it typically only do an
unlikely check. Split it into an inline fast path, and a function call
for the slow path. This brings a performance gain of about 500ns/event
on the Cubietruck.