pachi icon indicating copy to clipboard operation
pachi copied to clipboard

cache locality project coordination

Open sthalik opened this issue 8 years ago • 2 comments

Hey,

Responding to @pasky's earlier request for work regarding struct tree_node cache locality.

The issue is not completely trivial, particularly due to threading and the arbitrary order in which nodes are traversed. This is not merely refactoring.

My best idea is to allocate pages, or several, on a per-thread basis. Get rid of sibling pointer, next sibling can be calculated by parent plus pointer arithmetic of parent->data with regard to given struct tree_node's address. There probably needs to be stored the thread's id as well as a reference count in the page's end or beginning.

Some questions remain, most importantly - can multiple threads add subnodes to the same node?

Another idea is to realloc(3) node's data keeping excess free capacity. But threading again.

sthalik avatar Mar 30 '16 11:03 sthalik

All subnodes are allocated in batch during the node expansion. Two threads can theoretically race to expand the same node, but right now that does no harm (except wasted CPU).

pasky avatar Mar 30 '16 19:03 pasky

@pasky there are two paths, one for fast_malloc after pruning and the second one.

The locality problem you presented before - can it be helped with using per-thread arenas? It's doable.

sthalik avatar Apr 15 '16 10:04 sthalik