urcu.git
Mathieu Desnoyers [Fri, 24 Aug 2012 17:21:49 +0000 (13:21 -0400)] 
hlist: implement cds_hlist_empty

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 23 Aug 2012 19:31:06 +0000 (15:31 -0400)] 
rcuja: partial implementation of cds_ja_del

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 22 Aug 2012 16:58:17 +0000 (12:58 -0400)] 
rcuja: implement ja_node_clear_nth

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 21 Aug 2012 21:38:27 +0000 (17:38 -0400)] 
rcuja: extend tests, more fixes

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 21 Aug 2012 14:42:38 +0000 (10:42 -0400)] 
rcuja: fix max depth test

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 21 Aug 2012 14:40:44 +0000 (10:40 -0400)] 
rcuja: swap key

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 21 Aug 2012 13:08:46 +0000 (09:08 -0400)] 
rcuja: add fallback nodes

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 21 Aug 2012 00:25:37 +0000 (20:25 -0400)] 
rcuja: various fixes

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 20 Aug 2012 16:02:54 +0000 (12:02 -0400)] 
rcuja: add basic test

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 19 Aug 2012 13:28:53 +0000 (09:28 -0400)] 
rcuja: create shadow node for root

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 19 Aug 2012 01:16:34 +0000 (21:16 -0400)] 
rcuja: implement add

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 13 Aug 2012 13:58:40 +0000 (09:58 -0400)] 
rcuja: implement lookup

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 13 Aug 2012 13:16:13 +0000 (09:16 -0400)] 
rcuja: rename cds_ja_node to cds_ja_inode

inode for internal node.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 13 Aug 2012 12:26:55 +0000 (08:26 -0400)] 
rcuja: new and destroy

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 13 Aug 2012 12:09:00 +0000 (08:09 -0400)] 
rcuja: rename to cds_ja

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 13 Aug 2012 00:55:31 +0000 (20:55 -0400)] 
rcuja: add comment about use of number of nodes

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 13 Aug 2012 00:39:33 +0000 (20:39 -0400)] 
rcuja: fix iteration on recompact add

We must iterate on all entries by position, not by value.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 12 Aug 2012 23:52:58 +0000 (19:52 -0400)] 
rcuja: share lock across all nodes with same key

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 12 Aug 2012 20:50:02 +0000 (16:50 -0400)] 
rcuja: no need to link with urcu lib anymore

Now that we use the rcu flavor provided as parameter by the application,
there is no need to link with a urcu lib flavor.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 12 Aug 2012 20:26:53 +0000 (16:26 -0400)] 
rcuja: use rcu ja app flavor for shadow hash table

Since we use call_rcu to delay reclaim of the rcu ja node too, we need
to use the same RCU flavor as the application that calls the RCU JA API.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 12 Aug 2012 20:13:27 +0000 (16:13 -0400)] 
rcuja: shadow clear also frees the rcu ja node associated

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 12 Aug 2012 20:06:21 +0000 (16:06 -0400)] 
rcuja: implement shadow node hash table

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 12 Aug 2012 19:08:00 +0000 (15:08 -0400)] 
rcuja: add shadow nodes hash table

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 12 Aug 2012 18:38:52 +0000 (14:38 -0400)] 
rcuja: add data structures for rcu_ja and shadow nodes

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 12 Aug 2012 18:29:40 +0000 (14:29 -0400)] 
rcuja: add missing assign in recompact

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 12 Aug 2012 18:04:40 +0000 (14:04 -0400)] 
rcuja: implement node add recompaction

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 12 Aug 2012 17:06:53 +0000 (13:06 -0400)] 
rcuja: set node update, rcu-ize get node/set node

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 12 Aug 2012 16:51:14 +0000 (12:51 -0400)] 
rcuja: cleanup

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 3 Aug 2012 03:07:23 +0000 (23:07 -0400)] 
rcuja: implement node set nth

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 30 Jul 2012 03:47:36 +0000 (23:47 -0400)] 
rcuja: introduce union to represent nodes

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sat, 16 Jun 2012 19:14:03 +0000 (15:14 -0400)] 
rcuja testpop: print extra items in subclass instead of confusing "unbalance"

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sat, 16 Jun 2012 18:12:55 +0000 (14:12 -0400)] 
Use statistical approach to approximate the max number of nodes per population

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 20 Mar 2012 13:15:47 +0000 (09:15 -0400)] 
rcuja: Update design document, discuss pool distributions

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 9 Mar 2012 23:14:21 +0000 (18:14 -0500)] 
rcuja: use pool of linear array instead of bitmap

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 9 Mar 2012 18:09:18 +0000 (13:09 -0500)] 
RCU judy array: implement node get functions

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 9 Mar 2012 03:23:33 +0000 (22:23 -0500)] 
rcuja: Increase granularity

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 9 Mar 2012 01:16:27 +0000 (20:16 -0500)] 
RCU Judy Array Design and initial files

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 8 May 2013 13:53:45 +0000 (09:53 -0400)] 
Fix: membarrier fallback symbol conflict

* Lai Jiangshan (laijs@cn.fujitsu.com) wrote:
> Hi, Mathieu,
>
> There is a big compatible problem in URCU which should be fix in next round.
>
> LB: liburcu built on the system which has sys_membarrier().
> LU: liburcu built on the system which does NOT have sys_membarrier().
>
> LBM: liburcu-mb ....
> LUM: liburcu-mb ...
>
> AB: application(-lliburcu) built on the system which has sys_membarrier().
> AU: application(-lliburcu) built on the system which does NOT have
> sys_membarrier().
>
> ABM application(-lliburcu-mb) ...
> AUM application(-lliburcu-mb) ...
>
> AB/AU + LB/LU: 4 combinations
> ABM/AUM + LBM/LUM: 4 combinations
>
> I remember some of the 8 combinations can't works due to symbols are
> miss match.  only LU+AB and LB+AU ?
>
> could you check it?
>
> How to fix it: In LU and AU, keep all the symbol name/ABI as LA and
> AB, but only the behaviors falls back to URCU_MB.

Define membarrier() as -ENOSYS when SYS_membarrier is not found in the
system headers. Check dynamically for membarrier availability to ensure
ABI compatibility between applications and libraries.
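
The approach described above can be sketched roughly as follows. This is an illustration only, not the code from the patch: every `sketch_` name is hypothetical, command value 0 is assumed to be the "query" command of the current Linux membarrier system call, and the barrier chosen on each path is an assumption about how such a fallback is typically wired.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical wrapper: fall back to -ENOSYS when the headers lack the syscall. */
#ifdef SYS_membarrier
# define sketch_membarrier(...)  syscall(SYS_membarrier, __VA_ARGS__)
#else
# define sketch_membarrier(...)  (-ENOSYS)
#endif

/* Decided once at run time, so the exported symbols never change. */
int sketch_has_membarrier;

void sketch_membarrier_init(void)
{
    /* Assumption: command 0 is the query command; a negative result
     * (from the kernel or from the fallback macro) means "unavailable". */
    sketch_has_membarrier = sketch_membarrier(0, 0, 0) >= 0;
}

void sketch_read_side_barrier(void)
{
    if (sketch_has_membarrier)
        __asm__ __volatile__("" ::: "memory");  /* compiler barrier only */
    else
        __sync_synchronize();                   /* full memory barrier fallback */
}
```

Because the decision is taken at run time, a library built with or without the system call in its headers exposes the same functions, which is the property the report asks for.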

Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 10 May 2013 11:30:18 +0000 (07:30 -0400)] 
Fix: Use a filled signal mask to disable all signals

Changelog from David Pelton's original patch:

While using lttng-ust with an application that was calling fork()
with pending signals, I found that all signals were getting unmasked
shortly before the underlying call to fork().  After some
investigation, I found that the rcu_bp_before_fork() function was
unmasking all signals.  Based on the comments for this function, it
should be masking all signals.  Inspection of the rest of the code
in urcu-bp.c revealed the same pattern in two other functions.

This patch changes the code to use a filled signal mask to disable
all signals.  The change to rcu_bp_before_fork() addressed the
problem I was seeing while using lttng-ust.  The changes to the
other two functions appear to fix other instances of the same
problem.

Updates by Mathieu Desnoyers:

- Use SIG_BLOCK instead of SIG_SETMASK when setting a filled mask. This
  has the same behavior in this case (since we're blocking all signals),
  but is semantically neater: if we ever remove some signals from that
  mask, we'd like to do a union with the signal mask already blocked by
  the application.
- Also fix incorrect signal masking in compat_arch_x86.c.
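
A minimal sketch of the pattern described above, assuming POSIX signal APIs. The function names are hypothetical and this is not the code from urcu-bp.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Block every signal for the calling thread; the previous mask is
 * saved in *oldmask so that it can be restored later. */
int sketch_block_all_signals(sigset_t *oldmask)
{
    sigset_t newmask;

    /* A filled set contains every signal. An empty set here would
     * block nothing, which is the bug described above. */
    if (sigfillset(&newmask))
        return -1;
    /* SIG_BLOCK forms the union with the mask the caller already has. */
    return pthread_sigmask(SIG_BLOCK, &newmask, oldmask);
}

/* Restore the mask saved by sketch_block_all_signals(). */
int sketch_restore_signals(const sigset_t *oldmask)
{
    return pthread_sigmask(SIG_SETMASK, oldmask, NULL);
}
```

A caller would invoke `sketch_block_all_signals()` before `fork()` and `sketch_restore_signals()` afterwards in both parent and child.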

Reported-by: David Pelton <dpelton@ciena.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 6 May 2013 14:30:57 +0000 (10:30 -0400)] 
urcu-bp: introduce struct urcu_gp

Make urcu-bp similar to urcu-qsbr and other urcu flavors.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 6 May 2013 14:24:14 +0000 (10:24 -0400)] 
Fix: struct urcu_gp broke multiflavor

Add mapping to namespace urcu_gp.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 6 May 2013 14:03:55 +0000 (10:03 -0400)] 
Cleanup test usage printout

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 6 May 2013 13:35:42 +0000 (09:35 -0400)] 
wfstack tests: use pop "last" state info

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 6 May 2013 13:35:07 +0000 (09:35 -0400)] 
wfstack: return whether pop is popping the last element

Newly introduced "with_state" pop API members return stack state
atomically sampled with the pop operation.

Allow testing behavior of pop with respect to number of push-to-empty
and pop-all-from-non-empty.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 6 May 2013 13:34:00 +0000 (09:34 -0400)] 
wfcqueue tests: use dequeue empty state

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Mon, 6 May 2013 13:33:36 +0000 (09:33 -0400)] 
wfcqueue: return whether dequeue is dequeuing last element

Newly introduced "with_state" dequeue API members return queue state
atomically sampled with the dequeue operation.

Allow testing behavior of dequeue with respect to number of
enqueue-to-empty and splice-from-non-empty.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Lai Jiangshan [Mon, 6 May 2013 12:42:27 +0000 (08:42 -0400)] 
urcu: avoid false sharing for rcu_gp_ctr

@rcu_gp_ctr and @registry share the same cache line, which causes
false sharing and slows down both the read side and the update side.

Fix: Use a different cache line for each of them.

Although rcu_gp_futex is updated less often than rcu_gp_ctr, the two are
almost always accessed at the same time, so we also move rcu_gp_futex
to the cache line of rcu_gp_ctr to reduce the number of cache lines used
(and thus cache misses) on the read side.
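
The layout idea can be sketched as below. The structure and variable names are hypothetical, the 64-byte cache line size is an assumption, and the actual patch may organize the fields differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE_SIZE 64   /* assumption: 64-byte cache lines */

/* Counter and futex are kept together: they are accessed at nearly
 * the same time, so sharing one line saves a cache miss. */
struct sketch_gp {
    unsigned long ctr;
    int32_t futex;
} __attribute__((aligned(SKETCH_CACHE_LINE_SIZE)));

/* The registry head gets a cache line of its own. */
struct sketch_registry {
    void *head;
} __attribute__((aligned(SKETCH_CACHE_LINE_SIZE)));

struct sketch_gp sketch_gp_state;
struct sketch_registry sketch_registry_state;

/* Index of the cache line that an address falls into. */
uintptr_t sketch_cache_line_of(const void *p)
{
    return (uintptr_t)p / SKETCH_CACHE_LINE_SIZE;
}
```

With this layout, a writer bumping `ctr` no longer invalidates the line that holds the registry head.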

test: (4X6=24 CPUs)

Before patch:

[root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
SUMMARY ./tests/test_urcu_mb      testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads   2100285330 nr_writes      3390219 nr_ops   2103675549
[root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
SUMMARY ./tests/test_urcu_mb      testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads   1619868562 nr_writes      3529478 nr_ops   1623398040
[root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
SUMMARY ./tests/test_urcu_mb      testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads   1949067038 nr_writes      3469334 nr_ops   1952536372

after patch:

[root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
SUMMARY ./tests/test_urcu_mb      testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads   3380191848 nr_writes      4903248 nr_ops   3385095096
[root@localhost userspace-rcu]# ./tests/test_urcu_mb 20 1 20
SUMMARY ./tests/test_urcu_mb      testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads   3397637486 nr_writes      4129809 nr_ops   3401767295

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Lai Jiangshan [Mon, 6 May 2013 12:32:02 +0000 (08:32 -0400)] 
urcu: make the code of urcu-qsbr as normal urcu

In urcu-qsbr, read-side quiescence takes much longer than in normal urcu ==>
synchronize_rcu() is much slower ==>
rcu_gp_ctr is updated much less often ==>
urcu-qsbr as a whole is not slowed down by false sharing of rcu_gp_ctr.

Still, this patch makes sense: it keeps the urcu-qsbr code similar to
normal urcu, which improves readability and maintainability.

Test: (4*6 CPUs)
Before patch:
[root@localhost userspace-rcu]# ./tests/test_urcu_qsbr 20 1 20
SUMMARY ./tests/test_urcu_qsbr    testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  65498297587 nr_writes      2000665 nr_ops  65500298252
[root@localhost userspace-rcu]# ./tests/test_urcu_qsbr 20 1 20
SUMMARY ./tests/test_urcu_qsbr    testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  67218079467 nr_writes      1981593 nr_ops  67220061060

After patch
./tests/test_urcu_qsbr 20 1 20
SUMMARY ./tests/test_urcu_qsbr    testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  67473798999 nr_writes      1999151 nr_ops  67475798150
[root@localhost userspace-rcu]# ./tests/test_urcu_qsbr 20 1 20
SUMMARY ./tests/test_urcu_qsbr    testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  67065521397 nr_writes      1993956 nr_ops  67067515353

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 30 Apr 2013 01:30:17 +0000 (21:30 -0400)] 
rculfhash: detect if resize/destroy are called within RCU read-side C.S.

Report errors.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 30 Apr 2013 00:48:40 +0000 (20:48 -0400)] 
Documentation: rculfhash: cds_lfht_resize not within read-side C.S.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 30 Apr 2013 00:28:10 +0000 (20:28 -0400)] 
fix: rculfhash don't change qsbr online state

resize and destroy should not change the QSBR online state. Use the new
rcu_read_ongoing() API for this.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 30 Apr 2013 00:17:22 +0000 (20:17 -0400)] 
Add rcu_read_ongoing() API to each urcu flavor

This will allow checking whether:

- the thread is online (QSBR),
- the thread is nested within a read-side critical section (other flavors).

This is useful for libraries that need to know if QSBR is online in
order to save the original state temporarily so it can be restored
before returning to the caller.

Eventually, this API can be called by a "debugging" implementation of
rcu_dereference() and other urcu-pointer.h API members to check that no
RCU pointer is read outside of RCU read-side critical sections.
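
As a toy model of the idea (not liburcu itself), the sketch below tracks read-side nesting with a per-thread counter and lets a resize-like operation refuse to run inside a critical section. All `sketch_` names are hypothetical, and the real flavors track their state differently.

```c
#include <assert.h>

/* Per-thread nesting depth of read-side critical sections (toy model). */
static __thread unsigned int sketch_nesting;

void sketch_read_lock(void)
{
    sketch_nesting++;
}

void sketch_read_unlock(void)
{
    sketch_nesting--;
}

/* Non-zero when the calling thread is inside a read-side critical section. */
int sketch_read_ongoing(void)
{
    return sketch_nesting != 0;
}

/* A resize-like operation that must not run within a read-side C.S. */
int sketch_resize(void)
{
    if (sketch_read_ongoing())
        return -1;          /* report the misuse instead of deadlocking */
    /* ... the actual work would go here ... */
    return 0;
}
```

A debugging build of a pointer-dereference helper could call `sketch_read_ongoing()` in the same way to assert that it runs under read-side protection.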

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 17 Apr 2013 21:26:05 +0000 (17:26 -0400)] 
Add "sparc" host cpu to configure.ac

Some sparc Debian setups advertise a "sparc" host cpu (rather than
sparc64).

In all cases, I think it should be safe to add a "sparc" entry to
userspace RCU configure.ac upstream, e.g.

        [sparc], [ARCHTYPE="sparc64"],

In the event that someone launches the build in an environment that does
not support sparc v9, the build will fail because the 32-bit compiler
will not be able to generate sparc v9 instructions (unless
explicitly instructed to do so by the -m32 -Wa,-Av9a flags).

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 16 Apr 2013 17:09:02 +0000 (13:09 -0400)] 
futex: include syscall.h instead of sys/syscall.h

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Paul E. McKenney [Thu, 14 Mar 2013 15:22:23 +0000 (11:22 -0400)] 
Add tab to output in order to allow easy nesting of tables.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 14 Mar 2013 14:32:13 +0000 (10:32 -0400)] 
Remove urcu-api-list.sh from dist tarball

It needs to be run in the git repository.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Paul E. McKenney [Thu, 14 Mar 2013 14:11:53 +0000 (10:11 -0400)] 
Add urcu-api-list.sh script

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 13 Mar 2013 16:23:11 +0000 (12:23 -0400)] 
list: implement cds_list_for_each_safe()

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 22 Feb 2013 16:34:25 +0000 (11:34 -0500)] 
Fix: tests/api.h use cpuset.h

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 22 Feb 2013 15:57:48 +0000 (10:57 -0500)] 
Fix hurd-i386: move cpuset tests outside of sched_setaffinity conditional

Comment about introduction of cpuset.h within urcu tests:

> Unfortunately it doesn't work, because sched_setaffinity is for now
> just a fail-stub on hurd-i386, and thus configure considers it as
> missing, and thus the CPU_SET test is disabled completely.
>
> I however guess you could just disable defining your own cpu_set_t
> when !HAVE_SCHED_SETAFFINITY, since it is probably used only for using
> sched_setaffinity.

Fix by moving cpu_set_t, CPU_SET and CPU_ZERO tests outside of the
sched_setaffinity conditional.

Reported-by: Samuel Thibault <sthibault@debian.org>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 22 Feb 2013 14:05:32 +0000 (09:05 -0500)] 
Fix tests: finer-grained use of CPU_SET, CPU_ZERO and cpu_set_t

Noticed build failure at
https://buildd.debian.org/status/package.php?p=liburcu :

Tail of log for liburcu on hurd-i386:

test_urcu.c:110:0: warning: "CPU_SET" redefined [enabled by default]
In file included from /usr/include/pthread/pthread.h:50:0,
                 from /usr/include/pthread.h:2,
                 from test_urcu.c:26:
/usr/include/sched.h:80:0: note: this is the location of the previous definition
make[3]: *** [test_urcu.o] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all] Error 2
dh_auto_build: make -j1 returned exit code 2
make: *** [build-arch] Error 2
dpkg-buildpackage: error: debian/rules build-arch gave error exit status 2
make[3]: Entering directory `/build/buildd-liburcu_0.7.6-1-hurd-i386-wGBAtt/liburcu-0.7.6/tests'
  CC     test_urcu.o
make[3]: Leaving directory `/build/buildd-liburcu_0.7.6-1-hurd-i386-wGBAtt/liburcu-0.7.6/tests'
make[2]: Leaving directory `/build/buildd-liburcu_0.7.6-1-hurd-i386-wGBAtt/liburcu-0.7.6'

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 22 Feb 2013 13:50:49 +0000 (08:50 -0500)] 
Test for CPU_SET

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 22 Feb 2013 13:35:37 +0000 (08:35 -0500)] 
Fix build on architectures with HAVE_SCHED_GETCPU but without HAVE_SYSCONF

Noticed on: https://buildd.debian.org/status/package.php?p=liburcu

Tail of log for liburcu on kfreebsd-amd64:

  CC     urcu.lo
In file included from urcu.c:450:0:
urcu-call-rcu-impl.h:145:12: error: static declaration of 'sched_getcpu' follows non-static declaration
In file included from /usr/include/sched.h:43:0,
                 from /usr/include/pthread.h:20,
                 from urcu.c:30:
/usr/include/x86_64-kfreebsd-gnu/bits/sched.h:65:12: note: previous declaration of 'sched_getcpu' was here
make[3]: *** [urcu.lo] Error 1
make[3]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-amd64-nnkICd/liburcu-0.7.6'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-amd64-nnkICd/liburcu-0.7.6'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-amd64-nnkICd/liburcu-0.7.6'
dh_auto_build: make -j1 returned exit code 2
make: *** [build-arch] Error 2

Tail of log for liburcu on kfreebsd-i386:

  CC     urcu.lo
In file included from urcu.c:450:0:
urcu-call-rcu-impl.h:145:12: error: static declaration of 'sched_getcpu' follows non-static declaration
In file included from /usr/include/sched.h:43:0,
                 from /usr/include/pthread.h:20,
                 from urcu.c:30:
/usr/include/i386-kfreebsd-gnu/bits/sched.h:65:12: note: previous declaration of 'sched_getcpu' was here
make[3]: *** [urcu.lo] Error 1
make[3]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-i386-sWzNKU/liburcu-0.7.6'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-i386-sWzNKU/liburcu-0.7.6'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/build/buildd-liburcu_0.7.6-1-kfreebsd-i386-sWzNKU/liburcu-0.7.6'
dh_auto_build: make -j1 returned exit code 2
make: *** [build-arch] Error 2

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 22 Feb 2013 13:04:29 +0000 (08:04 -0500)] 
README: document that Clang 3.0 (based on LLVM 3.0) is supported

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Fri, 22 Feb 2013 12:57:16 +0000 (07:57 -0500)] 
clang: silence "unused expression result" warning

CMM_STORE_SHARED(x, v) is a macro that really acts like an assignment
expression, e.g.:

  x = v;

but internally also has "mc" barriers (useful for cache-incoherent
architectures).

The issue here is that (x = v) can evaluate to "v", but very often we're
not interested in using the assignment expression result. When we have an
explicit assignment, the compiler won't complain that the result of this
expression is unused, but given that the added barrier requires that we
make this macro evaluate explicitly to a value, clang complains.

Fix this by adding "_v = _v" at the last line of the macro, thus
performing what would appear like an effect-less assignment, but
actually tricks clang into thinking we are evaluating to an assignment
expression, thus suppressing the warning.
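
The trick can be illustrated with a simplified macro. This is a sketch, not the real CMM_STORE_SHARED: the macro name is hypothetical, the barrier is reduced to a compiler barrier, and it relies on GNU statement expressions. Newer compilers may have their own diagnostics for self-assignment.

```c
#include <assert.h>

#define sketch_barrier()  __asm__ __volatile__("" ::: "memory")

/* Acts like the assignment expression "x = v", with a barrier after
 * the store. The last statement gives the macro its value. */
#define SKETCH_STORE_SHARED(x, v)                              \
    __extension__ ({                                           \
        __typeof__(x) _v = (v);                                \
        *(volatile __typeof__(x) *)&(x) = _v;                  \
        sketch_barrier();                                      \
        _v = _v;   /* ends on an assignment, not a bare _v */  \
    })

int sketch_shared;

int sketch_publish(int value)
{
    SKETCH_STORE_SHARED(sketch_shared, value);             /* result ignored */
    return SKETCH_STORE_SHARED(sketch_shared, value + 1);  /* result used */
}
```

Both uses compile: the first discards the value like a plain assignment statement would, and the second consumes it.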

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 14 Feb 2013 16:36:43 +0000 (11:36 -0500)] 
rculfhash: add assertions on node alignment

I've had a report of someone running into issues with the RCU lock-free
hash table by embedding the struct cds_lfht_node into a packed structure
by mistake, thus not respecting alignment requirements stated in
urcu/rculfhash.h. Assertions on "replace" and "add" operations should
catch this, but I notice that we should add assertions on the
REMOVAL_OWNER_FLAG to cover all possible misalignments.
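
The kind of check involved can be sketched as follows. The names are hypothetical and the assumption is that the three low-order bits of a node pointer are reserved for flags, so nodes must be 8-byte aligned. The real assertions live inside the hash table implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FLAG_BITS  3   /* assumption: three low bits carry flags */
#define SKETCH_FLAG_MASK  ((((uintptr_t)1) << SKETCH_FLAG_BITS) - 1)

/* Stand-in for a hash table node with an alignment requirement. */
struct sketch_node {
    struct sketch_node *next;
    unsigned long hash;
} __attribute__((aligned(8)));

/* Non-zero when no flag bit overlaps the address of the node. */
int sketch_node_is_aligned(const void *node)
{
    return ((uintptr_t)node & SKETCH_FLAG_MASK) == 0;
}

/* Correct embedding: the compiler pads so that "node" stays aligned. */
struct sketch_padded {
    char tag;
    struct sketch_node node;
};

/* Mistaken embedding: packing places "node" at offset 1. */
struct sketch_packed {
    char tag;
    struct sketch_node node;
} __attribute__((packed));
```

An `assert(sketch_node_is_aligned(node))` at the entry of each operation that tags pointers would catch the packed case immediately.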

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Etienne Bergeron [Wed, 13 Feb 2013 02:33:16 +0000 (21:33 -0500)] 
Spelling cleanups within comments and documentation

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Simon Marchi [Tue, 12 Feb 2013 00:10:44 +0000 (19:10 -0500)] 
Fix configure checks for Tile

The previous method of checking whether the architecture is TileGx or
not was buggy. urcu/arch/tile.h included urcu/arch/gcc.h, which was not
installed on the system, causing a configure error. I am not sure why it
worked when I tested commit 1000f1f4204e5fbb337f4ea911f1e29f67df79aa,
maybe some previous partial install or something.

The check is now done earlier, during the configure step and should not
cause any trouble.

Signed-off-by: Simon Marchi <simon.marchi@polymtl.ca>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Thu, 31 Jan 2013 16:31:39 +0000 (11:31 -0500)] 
uatomic: style fix

- Functions that don't take arguments should be declared with a "void"
  parameter list in C; otherwise they are declared as taking an
  unspecified number of arguments.
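
For illustration, the difference looks like this (hypothetical function names; the note about empty parentheses applies to C standards before C23):

```c
#include <assert.h>

/* Before C23, empty parentheses leave the parameters unspecified,
 * so the compiler cannot reject a call that passes arguments. */
int sketch_old_style();

/* A "void" parameter list is a real prototype: zero arguments. */
int sketch_prototype(void);

int sketch_old_style()
{
    return 1;
}

int sketch_prototype(void)
{
    return 2;
}
```

With the second form, a call such as `sketch_prototype(5)` is rejected at compile time.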

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sat, 26 Jan 2013 15:51:48 +0000 (10:51 -0500)] 
doc/cds-api.txt: expand documentation

Expand explanations, reorder items to have all wait-free descriptions
first, so that the rculfqueue API comes last, since it is less
featureful and is the only API of the queues/stacks to actually rely on
RCU.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sat, 26 Jan 2013 15:51:31 +0000 (10:51 -0500)] 
README: document each API file

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sat, 26 Jan 2013 15:48:28 +0000 (10:48 -0500)] 
README: reorganize

Move debug build options and SMP support description to the end of the README.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Simon Marchi [Thu, 24 Jan 2013 20:40:54 +0000 (15:40 -0500)] 
Add compilation support for the TileGX architecture

This patch adds compilation support for the TileGx architecture. Since
the tests were not run on other architectures of the Tile family
(Tile64, TilePro), errors are triggered during compilation if the
architecture is another Tile arch.

Signed-off-by: Simon Marchi <simon.marchi@polymtl.ca>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Sun, 20 Jan 2013 21:59:36 +0000 (16:59 -0500)] 
wfstack: add nonblocking to _LGPL_SOURCE API

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 26 Dec 2012 17:18:06 +0000 (12:18 -0500)] 
Discourage use of pthread_atfork() for call_rcu handlers

Discourage use of glibc pthread_atfork() for call_rcu handlers due to
its inappropriate assumptions about single-threadedness while pthread
atfork handlers are executing. This results in hangs within the glibc
memory allocator.

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Wed, 19 Dec 2012 00:31:21 +0000 (19:31 -0500)] 
Fix call_rcu fork handling

Fix call_rcu fork handling by putting all call_rcu threads in a
quiescent state before fork (paused state), and unpausing them when the
parent returns from fork.

On the child, everything will run fine as long as we don't issue fork()
from a call_rcu callback.

Side-note: pthread_atfork is not appropriate for multithreaded programs
that use malloc/free. The glibc malloc implementation sadly expects that all
malloc/free calls are executed from the context of a single thread while
pthread atfork handlers are running, which leads to interesting hangs in
glibc.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mathieu Desnoyers [Tue, 18 Dec 2012 04:43:14 +0000 (23:43 -0500)] 
test: fork handling

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Lai Jiangshan [Thu, 20 Dec 2012 11:13:57 +0000 (06:13 -0500)] 
rculfhash: add cds_lfht_replace to the write operations in the comments

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agourcu: fix comments for cds_list_for_each_prev()
Lai Jiangshan [Thu, 20 Dec 2012 11:13:09 +0000 (06:13 -0500)] 
urcu: fix comments for cds_list_for_each_prev()

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agodocumentation: fix rcu-api.txt duplicates
Mathieu Desnoyers [Mon, 10 Dec 2012 22:24:33 +0000 (17:24 -0500)] 
documentation: fix rcu-api.txt duplicates

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agotest wfcq: remove unneeded urcu.h include
Mathieu Desnoyers [Sat, 8 Dec 2012 15:16:10 +0000 (10:16 -0500)] 
test wfcq: remove unneeded urcu.h include

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agotest wfs: remove unneeded urcu.h include
Mathieu Desnoyers [Sat, 8 Dec 2012 15:15:49 +0000 (10:15 -0500)] 
test wfs: remove unneeded urcu.h include

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agourcu: declare test_urcu_multiflavor functions
Lai Jiangshan [Fri, 7 Dec 2012 16:37:21 +0000 (11:37 -0500)] 
urcu: declare test_urcu_multiflavor functions

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agourcu: remove the wrong comma
Lai Jiangshan [Fri, 7 Dec 2012 16:33:38 +0000 (11:33 -0500)] 
urcu: remove the wrong comma

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agowfstack: implement nonblocking pop and next
Mathieu Desnoyers [Wed, 5 Dec 2012 14:41:08 +0000 (09:41 -0500)] 
wfstack: implement nonblocking pop and next

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agowfcqueue: document first/next return values
Mathieu Desnoyers [Thu, 6 Dec 2012 21:02:30 +0000 (16:02 -0500)] 
wfcqueue: document first/next return values

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agowfstack: update comments about cds_wfs_empty/first being wait-free
Mathieu Desnoyers [Wed, 5 Dec 2012 14:20:52 +0000 (09:20 -0500)] 
wfstack: update comments about cds_wfs_empty/first being wait-free

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agowfstack API: rename cds_wfs_first_blocking to cds_wfs_first
Mathieu Desnoyers [Wed, 5 Dec 2012 14:01:21 +0000 (09:01 -0500)] 
wfstack API: rename cds_wfs_first_blocking to cds_wfs_first

cds_wfs_first never needs to block. This operation can be used to check
whether the stack returned by pop_all is empty, so it is useful to have
fully non-blocking semantics for all of the push/pop_all/first
operations. Only cds_wfs_next may block.
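
To illustrate why "first" can be non-blocking while "next" cannot, here is a
simplified sketch written with C11 atomics. It is not the cds_wfs code: the
type and function names (struct node, stack_push, stack_next, ...) are
hypothetical, and it assumes a push that swaps the head first and publishes
the node's next pointer afterwards.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	_Atomic(struct node *) next; /* NULL means "not published yet" */
};

struct stack {
	_Atomic(struct node *) head;
};

/* Sentinel for the bottom of the stack, distinct from NULL. */
static struct node stack_end;
#define STACK_END (&stack_end)

static void stack_init(struct stack *s)
{
	atomic_init(&s->head, STACK_END);
}

/* Push without looping. Returns true if the stack was empty before. */
static bool stack_push(struct stack *s, struct node *n)
{
	struct node *old;

	atomic_store(&n->next, NULL);
	old = atomic_exchange(&s->head, n);
	/* Window: n is reachable, but its next pointer is still NULL. */
	atomic_store(&n->next, old);
	return old == STACK_END;
}

/* Detach every node with one exchange; never waits. */
static struct node *stack_pop_all(struct stack *s)
{
	return atomic_exchange(&s->head, STACK_END);
}

/* First node of a detached stack, or NULL if it is empty; never waits. */
static struct node *stack_first(struct node *popped)
{
	return popped == STACK_END ? NULL : popped;
}

/* Successor of n, or NULL at the bottom. May spin inside the window. */
static struct node *stack_next(struct node *n)
{
	struct node *next;

	assert(n != NULL);
	while ((next = atomic_load(&n->next)) == NULL)
		; /* busy-wait until the pusher publishes next */
	return next == STACK_END ? NULL : next;
}
```

Only stack_next() has to wait for a concurrent pusher; stack_first() just
compares the pointer returned by stack_pop_all() against the sentinel.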

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agowfstack test: test if number of push to empty vs pop_all match
Mathieu Desnoyers [Wed, 5 Dec 2012 13:57:44 +0000 (08:57 -0500)] 
wfstack test: test if number of push to empty vs pop_all match

Do the same as the wfcqueue test: check that the number of pushes onto
an empty stack matches the number of pop_all calls that return a
non-empty stack.

Can be tested with:
./test_urcu_wfs 5 5 10 -w

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agowfstack: document first/next return values
Mathieu Desnoyers [Wed, 5 Dec 2012 13:53:08 +0000 (08:53 -0500)] 
wfstack: document first/next return values

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agotest wfstack: enforce external mutex if needed by default
Mathieu Desnoyers [Wed, 5 Dec 2012 11:13:08 +0000 (06:13 -0500)] 
test wfstack: enforce external mutex if needed by default

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agotest wfcqueue: enforce external mutex if needed by default
Mathieu Desnoyers [Wed, 5 Dec 2012 11:12:42 +0000 (06:12 -0500)] 
test wfcqueue: enforce external mutex if needed by default

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agourcu-mb/signal/membarrier: batch concurrent synchronize_rcu()
Mathieu Desnoyers [Mon, 26 Nov 2012 03:02:18 +0000 (22:02 -0500)] 
urcu-mb/signal/membarrier: batch concurrent synchronize_rcu()

Benchmarks of synchronize_rcu() batching show a significant scalability
improvement and speedup, e.g., on a
24-core AMD, with a write-heavy scenario (4 readers threads, 20 updater
threads, each updater using synchronize_rcu()):

* Serialized grace periods:
./test_urcu 4 20 20
SUMMARY ./test_urcu               testdur   20 nr_readers   4
rdur       0 wdur      0 nr_writers  20 wdelay      0
nr_reads    714598368 nr_writes      5032889 nr_ops    719631257

* Batched grace periods:

./test_urcu 4 20 20
SUMMARY ./test_urcu               testdur   20 nr_readers   4
rdur       0 wdur      0 nr_writers  20 wdelay      0
nr_reads    611848168 nr_writes      9877965 nr_ops    621726133

For a 9877965/5032889 = 1.96 speedup for 20 updaters.
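
The ratio above can be recomputed directly from the SUMMARY output. The
helpers below are a small sketch for doing so; summary_nr_writes() and
write_speedup() are hypothetical names and are not part of the test suite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the counter that follows "nr_writes" in a SUMMARY line.
 * Returns 0 on success, -1 if the field is missing or malformed.
 */
static int summary_nr_writes(const char *line, unsigned long long *out)
{
	const char *p;

	assert(line != NULL && out != NULL);
	p = strstr(line, "nr_writes");
	if (p == NULL)
		return -1;
	return sscanf(p + strlen("nr_writes"), "%llu", out) == 1 ? 0 : -1;
}

/*
 * Write-throughput ratio batched/serialized for two runs of equal
 * duration. Returns a negative value if either line cannot be parsed
 * or the serialized run reports zero writes.
 */
static double write_speedup(const char *serialized, const char *batched)
{
	unsigned long long s, b;

	if (summary_nr_writes(serialized, &s) != 0 ||
	    summary_nr_writes(batched, &b) != 0 || s == 0)
		return -1.0;
	return (double) b / (double) s;
}
```

Both runs must use the same test duration for the ratio to be meaningful,
as is the case for the 20-second runs quoted here.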

Of course, we can see that readers have slowed down, probably due to
increased update traffic, given there is no change to the read-side code
whatsoever.

Now let's see the penalty of managing the stack for a single updater.
With 4 readers, single updater:

* Serialized grace periods :

./test_urcu 4 1 20
SUMMARY ./test_urcu               testdur   20 nr_readers   4
rdur       0 wdur      0 nr_writers   1 wdelay      0
nr_reads    241959144 nr_writes     11146189 nr_ops    253105333
SUMMARY ./test_urcu               testdur   20 nr_readers   4
rdur       0 wdur      0 nr_writers   1 wdelay      0
nr_reads    257131080 nr_writes     12310537 nr_ops    269441617
SUMMARY ./test_urcu               testdur   20 nr_readers   4
rdur       0 wdur      0 nr_writers   1 wdelay      0
nr_reads    259973359 nr_writes     12203025 nr_ops    272176384

* Batched grace periods :

SUMMARY ./test_urcu               testdur   20 nr_readers   4
rdur       0 wdur      0 nr_writers   1 wdelay      0
nr_reads    298926555 nr_writes     14018748 nr_ops    312945303
SUMMARY ./test_urcu               testdur   20 nr_readers   4
rdur       0 wdur      0 nr_writers   1 wdelay      0
nr_reads    272411290 nr_writes     12832166 nr_ops    285243456
SUMMARY ./test_urcu               testdur   20 nr_readers   4
rdur       0 wdur      0 nr_writers   1 wdelay      0
nr_reads    267511858 nr_writes     12822026 nr_ops    280333884

Serialized and batched seem similar; batched is possibly even slightly
faster, but this is probably caused by NUMA affinity.
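
The message does not spell out how the batching works. The sketch below
shows one plausible shape, under the assumption that each updater pushes
itself onto a shared stack and that the updater that found the stack empty
runs a single grace period on behalf of everyone queued. All names
(struct waiter, wait_queue_add, batch_grace_period, fake_grace_period) are
hypothetical; this is not the urcu code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct waiter {
	struct waiter *next;
	atomic_bool done; /* set once a grace period covered this waiter */
};

struct wait_queue {
	_Atomic(struct waiter *) head;
};

/* Stand-in for the real grace-period wait; it only counts calls. */
static unsigned int nr_grace_periods;
static void fake_grace_period(void) { nr_grace_periods++; }

/* Queue w. Returns true if the queue was empty: the caller is leader. */
static bool wait_queue_add(struct wait_queue *q, struct waiter *w)
{
	struct waiter *old = atomic_load(&q->head);

	assert(w != NULL);
	atomic_init(&w->done, false);
	do {
		w->next = old;
	} while (!atomic_compare_exchange_weak(&q->head, &old, w));
	return old == NULL;
}

/*
 * Leader side: detach the current batch first, then run one grace
 * period and release every detached waiter. Updaters that queue after
 * the detach form the next batch. Returns the batch size.
 */
static unsigned int batch_grace_period(struct wait_queue *q,
				       void (*grace_period)(void))
{
	struct waiter *w = atomic_exchange(&q->head, NULL);
	unsigned int n = 0;

	grace_period();
	while (w != NULL) {
		struct waiter *next = w->next; /* read before releasing w */

		atomic_store(&w->done, true);
		n++;
		w = next;
	}
	return n;
}
```

In a real implementation the non-leader updaters would sleep until their
done flag is set; that part is omitted here. With a single updater every
call is a leader with a batch of one, which is the stack-management
overhead the single-updater runs measure.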

More benchmark results:

* Serialized synchronize_rcu() -- test_urcu (mb)

./test_urcu 4 1 20
SUMMARY ./test_urcu               testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads    222512859 nr_writes     10723654 nr_ops    233236513
./test_urcu 4 20 20
SUMMARY ./test_urcu               testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers  20 wdelay      0 nr_reads    722096653 nr_writes      5012429 nr_ops    727109082
./test_urcu 12 12 20
SUMMARY ./test_urcu               testdur   20 nr_readers  12 rdur      0 wdur      0 nr_writers  12 wdelay      0 nr_reads   1822868768 nr_writes      2300787 nr_ops   1825169555
./test_urcu 16 8 20
SUMMARY ./test_urcu               testdur   20 nr_readers  16 rdur      0 wdur      0 nr_writers   8 wdelay      0 nr_reads   2355908375 nr_writes      1604850 nr_ops   2357513225
./test_urcu 20 4 20
SUMMARY ./test_urcu               testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   4 wdelay      0 nr_reads   3003457459 nr_writes      1074828 nr_ops   3004532287
./test_urcu 20 3 20
SUMMARY ./test_urcu               testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   3 wdelay      0 nr_reads   2956972543 nr_writes      1036556 nr_ops   2958009099
./test_urcu 20 2 20
SUMMARY ./test_urcu               testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   2 wdelay      0 nr_reads   2890178860 nr_writes      1030095 nr_ops   2891208955
./test_urcu 20 1 20
SUMMARY ./test_urcu               testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads   3017482290 nr_writes       783420 nr_ops   3018265710

* Batched synchronize_rcu() -- test_urcu (mb)

./test_urcu 4 1 20
SUMMARY ./test_urcu               testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads    271476751 nr_writes     12858885 nr_ops    284335636
./test_urcu 4 20 20
SUMMARY ./test_urcu               testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers  20 wdelay      0 nr_reads    608488583 nr_writes     10080610 nr_ops    618569193
./test_urcu 12 12 20
SUMMARY ./test_urcu               testdur   20 nr_readers  12 rdur      0 wdur      0 nr_writers  12 wdelay      0 nr_reads   1260044362 nr_writes      7957711 nr_ops   1268002073
./test_urcu 16 8 20
SUMMARY ./test_urcu               testdur   20 nr_readers  16 rdur      0 wdur      0 nr_writers   8 wdelay      0 nr_reads   2048890674 nr_writes      5440985 nr_ops   2054331659
./test_urcu 20 4 20
SUMMARY ./test_urcu               testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   4 wdelay      0 nr_reads   2819267217 nr_writes      3093008 nr_ops   2822360225
./test_urcu 20 3 20
SUMMARY ./test_urcu               testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   3 wdelay      0 nr_reads   3067795320 nr_writes      2817760 nr_ops   3070613080
./test_urcu 20 2 20
SUMMARY ./test_urcu               testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   2 wdelay      0 nr_reads   3116770603 nr_writes      2404242 nr_ops   3119174845
./test_urcu 20 1 20
SUMMARY ./test_urcu               testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads   2238534130 nr_writes      3737588 nr_ops   2242271718

* Serialized synchronize_rcu() -- test_urcu_signal

./test_urcu_signal 4 1 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  16063309841 nr_writes         9217 nr_ops  16063319058
./test_urcu_signal 4 20 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers  20 wdelay      0 nr_reads  16065183739 nr_writes         9182 nr_ops  16065192921
./test_urcu_signal 12 12 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  12 rdur      0 wdur      0 nr_writers  12 wdelay      0 nr_reads  48028512672 nr_writes         8890 nr_ops  48028521562
./test_urcu_signal 16 8 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  16 rdur      0 wdur      0 nr_writers   8 wdelay      0 nr_reads  64001589198 nr_writes         8756 nr_ops  64001597954
./test_urcu_signal 20 4 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   4 wdelay      0 nr_reads  79907434070 nr_writes         9068 nr_ops  79907443138
./test_urcu_signal 20 3 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   3 wdelay      0 nr_reads  79987250839 nr_writes         8589 nr_ops  79987259428
./test_urcu_signal 20 2 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   2 wdelay      0 nr_reads  79749947176 nr_writes         8596 nr_ops  79749955772
./test_urcu_signal 20 1 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  79751023090 nr_writes         8624 nr_ops  79751031714

* Batched synchronize_rcu() -- test_urcu_signal

./test_urcu_signal 4 1 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  15739087241 nr_writes         9218 nr_ops  15739096459
./test_urcu_signal 4 20 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers  20 wdelay      0 nr_reads  15662135806 nr_writes        94833 nr_ops  15662230639
./test_urcu_signal 12 12 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  12 rdur      0 wdur      0 nr_writers  12 wdelay      0 nr_reads  46634363289 nr_writes        56903 nr_ops  46634420192
./test_urcu_signal 16 8 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  16 rdur      0 wdur      0 nr_writers   8 wdelay      0 nr_reads  62263951759 nr_writes        39058 nr_ops  62263990817
./test_urcu_signal 20 4 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   4 wdelay      0 nr_reads  77799768623 nr_writes        21065 nr_ops  77799789688
./test_urcu_signal 20 3 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   3 wdelay      0 nr_reads  76408008440 nr_writes        17026 nr_ops  76408025466
./test_urcu_signal 20 2 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   2 wdelay      0 nr_reads  77868927424 nr_writes        12630 nr_ops  77868940054
./test_urcu_signal 20 1 20
SUMMARY ./test_urcu_signal        testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  77293186844 nr_writes         8680 nr_ops  77293195524

CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agourcu-wait: move queue management code into urcu-wait.h
Mathieu Desnoyers [Mon, 19 Nov 2012 23:16:53 +0000 (18:16 -0500)] 
urcu-wait: move queue management code into urcu-wait.h

Note: urcu-wait.h is not yet exposed outside of userspace RCU.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agourcu-wait: move wait code into separate file
Mathieu Desnoyers [Sun, 18 Nov 2012 20:16:43 +0000 (15:16 -0500)] 
urcu-wait: move wait code into separate file

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agourcu-qsbr: batch concurrent synchronize_rcu()
Mathieu Desnoyers [Mon, 12 Nov 2012 17:40:12 +0000 (12:40 -0500)] 
urcu-qsbr: batch concurrent synchronize_rcu()

Benchmarks of synchronize_rcu() batching show a significant scalability
improvement and speedup, e.g., on a
24-core AMD, with a write-heavy scenario (4 readers threads, 20 updater
threads, each updater using synchronize_rcu()):

* Serialized grace periods :

./test_urcu_qsbr 4 20 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers   4
rdur      0 wdur      0 nr_writers  20 wdelay      0
nr_reads  20251412728 nr_writes      1826331 nr_ops  20253239059

* Batched grace periods :

./test_urcu_qsbr 4 20 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers   4
rdur      0 wdur      0 nr_writers  20 wdelay      0
nr_reads  15141994746 nr_writes      9382515 nr_ops  15151377261

For a 9382515/1826331 = 5.13 speedup for 20 updaters.

Of course, we can see that readers have slowed down, probably due to
increased update traffic, given there is no change to the read-side code
whatsoever.

Now let's see the penalty of managing the stack for a single updater.
With 4 readers, single updater:

* Serialized grace periods :

./test_urcu_qsbr 4 1 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers   4
rdur      0 wdur      0 nr_writers   1 wdelay      0
nr_reads  19240784755 nr_writes      2130839 nr_ops  19242915594

* Batched grace periods :

./test_urcu_qsbr 4 1 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers   4
rdur      0 wdur      0 nr_writers   1 wdelay      0
nr_reads  19160162768 nr_writes      2253068 nr_ops  19162415836

2253068 vs 2137036 -> a couple of runs show that this difference is lost
in the noise for a single updater.

More benchmark results:

* Serialized synchronize_rcu() -- test_urcu_qsbr

./test_urcu_qsbr 4 1 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  18841016559 nr_writes      1857130 nr_ops  18842873689
./test_urcu_qsbr 4 20 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers  20 wdelay      0 nr_reads  20272811733 nr_writes      1837027 nr_ops  20274648760
./test_urcu_qsbr 12 12 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  12 rdur      0 wdur      0 nr_writers  12 wdelay      0 nr_reads  60343516643 nr_writes      2353685 nr_ops  60345870328
./test_urcu_qsbr 16 8 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  16 rdur      0 wdur      0 nr_writers   8 wdelay      0 nr_reads  78202711840 nr_writes      2326331 nr_ops  78205038171
./test_urcu_qsbr 20 4 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   4 wdelay      0 nr_reads  94553396003 nr_writes      2238396 nr_ops  94555634399
./test_urcu_qsbr 20 3 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   3 wdelay      0 nr_reads  95004708661 nr_writes      2165966 nr_ops  95006874627
./test_urcu_qsbr 20 2 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   2 wdelay      0 nr_reads  95386506198 nr_writes      2194352 nr_ops  95388700550
./test_urcu_qsbr 20 1 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  84705972017 nr_writes      2609595 nr_ops  84708581612

* Batched synchronize_rcu() -- test_urcu_qsbr

./test_urcu_qsbr 4 1 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  19154850714 nr_writes      2238834 nr_ops  19157089548
./test_urcu_qsbr 4 20 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers   4 rdur      0 wdur      0 nr_writers  20 wdelay      0 nr_reads  15114131760 nr_writes      9370255 nr_ops  15123502015
./test_urcu_qsbr 12 12 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  12 rdur      0 wdur      0 nr_writers  12 wdelay      0 nr_reads  45541854970 nr_writes      5786496 nr_ops  45547641466
./test_urcu_qsbr 16 8 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  16 rdur      0 wdur      0 nr_writers   8 wdelay      0 nr_reads  66217337547 nr_writes      4257427 nr_ops  66221594974
./test_urcu_qsbr 20 4 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   4 wdelay      0 nr_reads  95048642908 nr_writes      2416266 nr_ops  95051059174
./test_urcu_qsbr 20 3 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   3 wdelay      0 nr_reads  96679609928 nr_writes      2211168 nr_ops  96681821096
./test_urcu_qsbr 20 2 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   2 wdelay      0 nr_reads  92166219811 nr_writes      1968725 nr_ops  92168188536
./test_urcu_qsbr 20 1 20
SUMMARY ./test_urcu_qsbr          testdur   20 nr_readers  20 rdur      0 wdur      0 nr_writers   1 wdelay      0 nr_reads  87986181951 nr_writes      3278737 nr_ops  87989460688

CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agotests: use standard malloc/free for synchronize_rcu()
Mathieu Desnoyers [Mon, 12 Nov 2012 14:07:34 +0000 (09:07 -0500)] 
tests: use standard malloc/free for synchronize_rcu()

Allows removing the mutex from tests, which allows testing the
scalability of concurrent synchronize_rcu() executions.

CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agourcu-bp: move quiescent threads to separate list
Mathieu Desnoyers [Mon, 12 Nov 2012 03:33:34 +0000 (22:33 -0500)] 
urcu-bp: move quiescent threads to separate list

Accelerate 2-phase grace period by not having to iterate twice on
threads not within RCU read-side critical section.
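
A simplified sketch of the idea, not the urcu-bp code: a scan unlinks every
reader found outside a read-side critical section and parks it on a second
list, so a later scan only walks the readers that were still active. The
boolean flag stands in for the real per-thread nesting counter, and the
names (struct reader, scan_readers) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct reader {
	bool in_read_side; /* stand-in for the per-thread nesting count */
	struct reader *next;
};

/*
 * Walk *pending once. Readers outside a read-side critical section are
 * moved to *quiescent; the others stay on *pending for the next scan.
 * Returns the number of readers examined.
 */
static unsigned int scan_readers(struct reader **pending,
				 struct reader **quiescent)
{
	struct reader **link = pending;
	unsigned int visited = 0;

	assert(pending != NULL && quiescent != NULL);
	while (*link != NULL) {
		struct reader *r = *link;

		visited++;
		if (!r->in_read_side) {
			*link = r->next;      /* unlink from pending */
			r->next = *quiescent; /* park on quiescent list */
			*quiescent = r;
		} else {
			link = &r->next;
		}
	}
	return visited;
}
```

Once the grace period completes, the quiescent list would be spliced back
into the registry; that step is left out of the sketch.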

CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agourcu-mb/signal/membarrier: move quiescent threads to separate list
Mathieu Desnoyers [Mon, 12 Nov 2012 03:32:28 +0000 (22:32 -0500)] 
urcu-mb/signal/membarrier: move quiescent threads to separate list

Accelerate 2-phase grace period by not having to iterate twice on
threads not nested within an RCU read-side lock.

CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
11 years agourcu-qsbr: move offline threads to separate list
Mathieu Desnoyers [Mon, 12 Nov 2012 03:31:28 +0000 (22:31 -0500)] 
urcu-qsbr: move offline threads to separate list

Accelerate 2-phase grace period by not having to iterate on offline
threads twice.

CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
CC: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>