cugraph icon indicating copy to clipboard operation
cugraph copied to clipboard

Ktruss implementation

Open jnke2016 opened this issue 1 year ago • 2 comments

jnke2016 avatar Dec 13 '23 15:12 jnke2016

@jnke2016

I thought more about implementing https://github.com/rapidsai/cugraph/issues/3446#issuecomment-1499649664; especially the steps 6, 7, and 8.

After step 6, sort the (edge source, edge destination, triangle count) triplets using edge destination as the primary key and edge source as the secondary key; this is to implement step 8-2.

  1. Run thrust::stable_partition to place the edges with triangle counts smaller than K-2 at the end. These edges should be "unrolled" and removed (unrolled means undoing their contributions to triangle counts, see the Pierce & Sanders for additional details)<= run thrust::stable_partition instead of thrust::partition, so each partition is still sorted.
  2. While (# edges to be remove > 0) 8-1. Find (p, q, intersection of p&q's neighbor lists) for each edge to be removed. For each vertex r in the intersection, enumerate (p,r,-1) and (q,r,-1)<=call detail::nbr_intersection again but only for the edges to be removed. This covers the case the edge to be removed is a p->q edge. Note that we should eventually run step 6 in chunks as the aggregate size of the intersection indices can far exceed the number of edges. So, we cannot store all the intersection indices and we need to re-compute. This is a memory footprint vs compute trade-off. 8-2. For each edge to be removed (say (q,r)), find all the incoming edges of r, then query the existence of (p,q) assuming p is an incoming neighbor of q. Accumulate (p,q,-1) and (p,r,-1) if (p,q) exists. <= This covers the case the edge to be removed is a q->r edge. We can run thrust::lower_bound & thrust::upper_bound on (s, d, triangle count) triplets to find all the incoming edges of q. Note that we need to run searches on both partitions of the (edge source, edge destination, triangle count) triplet array. In multi-GPU, we need to inspect all the GPUs that can possibly have r as an edge destination. This may sound a bit complicated unless you fully understand the partitioning and you may just add if constexpr (multi_gpu) { CUGRAPH_FAIL("unimplemented."); } for the initial SG implementation. We can come back to this once we validated the SG implementation. 8-3. Similarly for each edge to be removed (say (p,r)), find all the incoming edges of r, then query the existence of (p,q) assuming q is an incoming neighbor of r. Accumulate (p,q,-1) and (q,r,-1). <= This covers the case the edge to be removed is a p->r edge. I assume there will be significant overlap in code for 8-2 & 8-3. 8-4. Shuffle similar to 6-3 in multi-GPU. 8-5. Update triangle counts based on triplets (similar to 6-4 & 6-5). 8-6. Mask out the edges to be deleted. Also, resize the (edge source, edge destination, triangle count) triplet array so the deleted edges will no longer be considered. 8-7. Identify the new set of edges to be deleted by running thrust::stable_partition again on the edge list based on newly update triangle counts.

Please read this and let me know when you are ready to sync again.

seunghwak avatar Jan 10 '24 08:01 seunghwak

@jnke2016 Please ping me if you have any questions about the comments.

seunghwak avatar Feb 29 '24 20:02 seunghwak

/merge

rlratzel avatar Mar 26 '24 22:03 rlratzel