From: Michael Jeanson Date: Wed, 29 Nov 2017 22:03:21 +0000 (-0500) Subject: timer API transition for kernel 4.15 X-Git-Tag: v2.11.0-rc1~95 X-Git-Url: https://git.lttng.org./?a=commitdiff_plain;h=1fd97f9f4f773e3e9dfc787e9c90b1418fa5a7d4;p=lttng-modules.git timer API transition for kernel 4.15 The timer API changes starting from kernel 4.15.0. There's an interresting LWN article on this subject: https://lwn.net/Articles/735887/ Check these upstream commits for more details: commit 686fef928bba6be13cabe639f154af7d72b63120 Author: Kees Cook Date: Thu Sep 28 06:38:17 2017 -0700 timer: Prepare to change timer callback argument type Modern kernel callback systems pass the structure associated with a given callback to the callback function. The timer callback remains one of the legacy cases where an arbitrary unsigned long argument continues to be passed as the callback argument. This has several problems: - This bloats the timer_list structure with a normally redundant .data field. - No type checking is being performed, forcing callbacks to do explicit type casts of the unsigned long argument into the object that was passed, rather than using container_of(), as done in most of the other callback infrastructure. - Neighboring buffer overflows can overwrite both the .function and the .data field, providing attackers with a way to elevate from a buffer overflow into a simplistic ROP-like mechanism that allows calling arbitrary functions with a controlled first argument. - For future Control Flow Integrity work, this creates a unique function prototype for timer callbacks, instead of allowing them to continue to be clustered with other void functions that take a single unsigned long argument. This adds a new timer initialization API, which will ultimately replace the existing setup_timer(), setup_{deferrable,pinned,etc}_timer() family, named timer_setup() (to mirror hrtimer_setup(), making instances of its use much easier to grep for). In order to support the migration of existing timers into the new callback arguments, timer_setup() casts its arguments to the existing legacy types, and explicitly passes the timer pointer as the legacy data argument. Once all setup_*timer() callers have been replaced with timer_setup(), the casts can be removed, and the data argument can be dropped with the timer expiration code changed to just pass the timer to the callback directly. : Modern kernel callback systems pass the structure associated with a given callback to the callback function. The timer callback remains one of the legacy cases where an arbitrary unsigned long argument continues to be passed as the callback argument. This has several problems: - This bloats the timer_list structure with a normally redundant .data field. - No type checking is being performed, forcing callbacks to do explicit type casts of the unsigned long argument into the object that was passed, rather than using container_of(), as done in most of the other callback infrastructure. - Neighboring buffer overflows can overwrite both the .function and the .data field, providing attackers with a way to elevate from a buffer overflow into a simplistic ROP-like mechanism that allows calling arbitrary functions with a controlled first argument. - For future Control Flow Integrity work, this creates a unique function prototype for timer callbacks, instead of allowing them to continue to be clustered with other void functions that take a single unsigned long argument. This adds a new timer initialization API, which will ultimately replace the existing setup_timer(), setup_{deferrable,pinned,etc}_timer() family, named timer_setup() (to mirror hrtimer_setup(), making instances of its use much easier to grep for). In order to support the migration of existing timers into the new callback arguments, timer_setup() casts its arguments to the existing legacy types, and explicitly passes the timer pointer as the legacy data argument. Once all setup_*timer() callers have been replaced with timer_setup(), the casts can be removed, and the data argument can be dropped with the timer expiration code changed to just pass the timer to the callback directly. Since the regular pattern of using container_of() during local variable declaration repeats the need for the variable type declaration to be included, this adds a helper modeled after other from_*() helpers that wrap container_of(), named from_timer(). This helper uses typeof(*variable), removing the type redundancy and minimizing the need for line wraps in forthcoming conversions from "unsigned data long" to "struct timer_list *" in the timer callbacks: -void callback(unsigned long data) +void callback(struct timer_list *t) { - struct some_data_structure *local = (struct some_data_structure *)data; + struct some_data_structure *local = from_timer(local, t, timer); Finally, in order to support the handful of timer users that perform open-coded assignments of the .function (and .data) fields, provide cast macros (TIMER_FUNC_TYPE and TIMER_DATA_TYPE) that can be used temporarily. Once conversion has been completed, these can be globally trivially removed. ... commit e99e88a9d2b067465adaa9c111ada99a041bef9a Author: Kees Cook Date: Mon Oct 16 14:43:17 2017 -0700 treewide: setup_timer() -> timer_setup() This converts all remaining cases of the old setup_timer() API into using timer_setup(), where the callback argument is the structure already holding the struct timer_list. These should have no behavioral changes, since they just change which pointer is passed into the callback with the same available pointers after conversion. It handles the following examples, in addition to some other variations. ... commit 185981d54a60ae90942c6ba9006b250f3348cef2 Author: Kees Cook Date: Wed Oct 4 16:26:58 2017 -0700 timer: Remove init_timer_pinned() in favor of timer_setup() This refactors the only users of init_timer_pinned() to use the new timer_setup() and from_timer(). Drops the definition of init_timer_pinned(). ... Signed-off-by: Michael Jeanson Signed-off-by: Mathieu Desnoyers --- diff --git a/lib/ringbuffer/ring_buffer_frontend.c b/lib/ringbuffer/ring_buffer_frontend.c index bdd31add..abd97576 100644 --- a/lib/ringbuffer/ring_buffer_frontend.c +++ b/lib/ringbuffer/ring_buffer_frontend.c @@ -314,9 +314,9 @@ free_chanbuf: return ret; } -static void switch_buffer_timer(unsigned long data) +static void switch_buffer_timer(LTTNG_TIMER_FUNC_ARG_TYPE t) { - struct lib_ring_buffer *buf = (struct lib_ring_buffer *)data; + struct lib_ring_buffer *buf = lttng_from_timer(buf, t, switch_timer); struct channel *chan = buf->backend.chan; const struct lib_ring_buffer_config *config = &chan->backend.config; @@ -341,22 +341,22 @@ static void lib_ring_buffer_start_switch_timer(struct lib_ring_buffer *buf) { struct channel *chan = buf->backend.chan; const struct lib_ring_buffer_config *config = &chan->backend.config; + unsigned int flags = 0; if (!chan->switch_timer_interval || buf->switch_timer_enabled) return; if (config->alloc == RING_BUFFER_ALLOC_PER_CPU) - lttng_init_timer_pinned(&buf->switch_timer); - else - init_timer(&buf->switch_timer); + flags = LTTNG_TIMER_PINNED; - buf->switch_timer.function = switch_buffer_timer; + lttng_timer_setup(&buf->switch_timer, switch_buffer_timer, flags, buf); buf->switch_timer.expires = jiffies + chan->switch_timer_interval; - buf->switch_timer.data = (unsigned long)buf; + if (config->alloc == RING_BUFFER_ALLOC_PER_CPU) add_timer_on(&buf->switch_timer, buf->backend.cpu); else add_timer(&buf->switch_timer); + buf->switch_timer_enabled = 1; } @@ -377,9 +377,9 @@ static void lib_ring_buffer_stop_switch_timer(struct lib_ring_buffer *buf) /* * Polling timer to check the channels for data. */ -static void read_buffer_timer(unsigned long data) +static void read_buffer_timer(LTTNG_TIMER_FUNC_ARG_TYPE t) { - struct lib_ring_buffer *buf = (struct lib_ring_buffer *)data; + struct lib_ring_buffer *buf = lttng_from_timer(buf, t, read_timer); struct channel *chan = buf->backend.chan; const struct lib_ring_buffer_config *config = &chan->backend.config; @@ -406,6 +406,7 @@ static void lib_ring_buffer_start_read_timer(struct lib_ring_buffer *buf) { struct channel *chan = buf->backend.chan; const struct lib_ring_buffer_config *config = &chan->backend.config; + unsigned int flags; if (config->wakeup != RING_BUFFER_WAKEUP_BY_TIMER || !chan->read_timer_interval @@ -413,18 +414,16 @@ static void lib_ring_buffer_start_read_timer(struct lib_ring_buffer *buf) return; if (config->alloc == RING_BUFFER_ALLOC_PER_CPU) - lttng_init_timer_pinned(&buf->read_timer); - else - init_timer(&buf->read_timer); + flags = LTTNG_TIMER_PINNED; - buf->read_timer.function = read_buffer_timer; + lttng_timer_setup(&buf->read_timer, read_buffer_timer, flags, buf); buf->read_timer.expires = jiffies + chan->read_timer_interval; - buf->read_timer.data = (unsigned long)buf; if (config->alloc == RING_BUFFER_ALLOC_PER_CPU) add_timer_on(&buf->read_timer, buf->backend.cpu); else add_timer(&buf->read_timer); + buf->read_timer_enabled = 1; } diff --git a/wrapper/timer.h b/wrapper/timer.h index c1c0c952..4fc9828c 100644 --- a/wrapper/timer.h +++ b/wrapper/timer.h @@ -27,30 +27,76 @@ #include #include +/* + * In the olden days, pinned timers were initialized normaly with init_timer() + * and then modified with mod_timer_pinned(). + * + * Then came kernel 4.8.0 and they had to be initilized as pinned with + * init_timer_pinned() and then modified as regular timers with mod_timer(). + * + * Then came kernel 4.15.0 with a new timer API where init_timer() is no more. + * It's replaced by timer_setup() where pinned is now part of timer flags. + */ + + +#if (LINUX_VERSION_CODE >= KERNEL_VERSION(4,15,0)) + +#define LTTNG_TIMER_PINNED TIMER_PINNED +#define LTTNG_TIMER_FUNC_ARG_TYPE struct timer_list * + +#define lttng_mod_timer_pinned(timer, expires) \ + mod_timer(timer, expires) + +#define lttng_from_timer(var, callback_timer, timer_fieldname) \ + from_timer(var, callback_timer, timer_fieldname) + +#define lttng_timer_setup(timer, callback, flags, unused) \ + timer_setup(timer, callback, flags) + -#if (LTTNG_RT_VERSION_CODE >= LTTNG_RT_KERNEL_VERSION(4,6,4,8) \ +#else /* LINUX_VERSION_CODE >= KERNEL_VERSION(4,15,0) */ + + +# if (LTTNG_RT_VERSION_CODE >= LTTNG_RT_KERNEL_VERSION(4,6,4,8) \ || LINUX_VERSION_CODE >= KERNEL_VERSION(4,8,0)) -#define lttng_init_timer_pinned(timer) \ +#define lttng_init_timer_pinned(timer) \ init_timer_pinned(timer) -static inline int lttng_mod_timer_pinned(struct timer_list *timer, - unsigned long expires) -{ - return mod_timer(timer, expires); -} +#define lttng_mod_timer_pinned(timer, expires) \ + mod_timer(timer, expires) -#else +# else /* LTTNG_RT_VERSION_CODE >= LTTNG_RT_KERNEL_VERSION(4,6,4,8) */ -#define lttng_init_timer_pinned(timer) \ +#define lttng_init_timer_pinned(timer) \ init_timer(timer) -static inline int lttng_mod_timer_pinned(struct timer_list *timer, - unsigned long expires) +#define lttng_mod_timer_pinned(timer, expires) \ + mod_timer_pinned(timer, expires) + +# endif /* LTTNG_RT_VERSION_CODE >= LTTNG_RT_KERNEL_VERSION(4,6,4,8) */ + + +#define LTTNG_TIMER_PINNED TIMER_PINNED +#define LTTNG_TIMER_FUNC_ARG_TYPE unsigned long + +/* timer_fieldname is unused prior to 4.15. */ +#define lttng_from_timer(var, timer_data, timer_fieldname) \ + ((typeof(var))timer_data) + +static inline void lttng_timer_setup(struct timer_list *timer, + void (*function)(LTTNG_TIMER_FUNC_ARG_TYPE), + unsigned int flags, void *data) { - return mod_timer_pinned(timer, expires); + if (flags & LTTNG_TIMER_PINNED) + lttng_init_timer_pinned(timer); + else + init_timer(timer); + + timer->function = function; + timer->data = (unsigned long)data; } -#endif +#endif /* LINUX_VERSION_CODE >= KERNEL_VERSION(4,15,0) */ #endif /* _LTTNG_WRAPPER_TIMER_H */