urcu: avoid false sharing for rcu_gp_ctr
@rcu_gp_ctr and @registry share the same cache line, it causes
false sharing and slowdown both of the read site and update site.
Fix: Use different cache line for them.
Although rcu_gp_futex is updated less than rcu_gp_ctr, but
they always be accessed at almost the same time, so we also move rcu_gp_futex
to the cacheline of rcu_gp_ctr to reduce the cacheline-usage or cache-missing
of read site.
test: (4X6=24 CPUs)
Before patch:
[root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
SUMMARY ./tests/test_urcu_mb testdur 20 nr_readers 20 rdur 0 wdur 0 nr_writers 1 wdelay 0 nr_reads
2100285330 nr_writes
3390219 nr_ops
2103675549
[root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
SUMMARY ./tests/test_urcu_mb testdur 20 nr_readers 20 rdur 0 wdur 0 nr_writers 1 wdelay 0 nr_reads
1619868562 nr_writes
3529478 nr_ops
1623398040
[root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
SUMMARY ./tests/test_urcu_mb testdur 20 nr_readers 20 rdur 0 wdur 0 nr_writers 1 wdelay 0 nr_reads
1949067038 nr_writes
3469334 nr_ops
1952536372
after patch:
[root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
SUMMARY ./tests/test_urcu_mb testdur 20 nr_readers 20 rdur 0 wdur 0 nr_writers 1 wdelay 0 nr_reads
3380191848 nr_writes
4903248 nr_ops
3385095096
[root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
SUMMARY ./tests/test_urcu_mb testdur 20 nr_readers 20 rdur 0 wdur 0 nr_writers 1 wdelay 0 nr_reads
3397637486 nr_writes
4129809 nr_ops
3401767295
Singed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>