
[DEBT] Input Batch IDs Don't Line Up With Output Batch IDs

Open • alexbarghi-nv opened this issue 1 year ago • 1 comment

#3789 resolved an issue where empty minibatches were dropped from the bulk sampler. As a result of that fix, the output batch ids may not match the batch ids provided as input. This is not a problem for cuGraph-DGL and cuGraph-PyG, since both packages only expect the number of batches to match what is specified by the filename, which renumbering the remaining non-empty minibatches ensures.
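To make the mismatch concrete, here is a minimal pure-Python sketch of what renumbering non-empty minibatches does to the ids. It is not the cuGraph implementation: the function `renumber_nonempty` and the dictionary layout are hypothetical and exist only for illustration.

```python
# Hypothetical illustration only; this is not the cuGraph bulk sampler code.
def renumber_nonempty(batches):
    """Drop empty batches and assign contiguous output ids to the rest.

    batches: dict mapping input batch id -> list of sampled edges (may be empty).
    Returns a dict mapping output batch id -> (input batch id, edges).
    """
    kept = [(bid, edges) for bid, edges in sorted(batches.items()) if edges]
    return {new_id: (old_id, edges) for new_id, (old_id, edges) in enumerate(kept)}


# Input batch 1 produced no samples, so it is dropped.
sampled = {0: [(1, 2)], 1: [], 2: [(3, 4)], 3: [(5, 6)]}
renumbered = renumber_nonempty(sampled)

for out_id, (in_id, edges) in renumbered.items():
    print(f"output batch {out_id} <- input batch {in_id}: {edges}")
```

In this sketch, output batch 1 holds the samples of input batch 2. The batch count is consistent after renumbering, but an id no longer identifies the same seeds on both sides.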

However, this could cause problems when debugging future bulk sampling issues, or if batch ids become important to end users. Ultimately, there should be a better way to handle empty batches, possibly by just returning the input seeds (which may line up better with end-user expectations), or through some other solution.
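One way to read the "return the input seeds" idea is sketched below in plain Python. This is an assumption about what such a fix could look like, not a proposed or existing cuGraph API: `keep_empty_batches` and its arguments are hypothetical names.

```python
# Hypothetical sketch of the "return the input seeds" option; not cuGraph code.
def keep_empty_batches(seeds_by_batch, edges_by_batch):
    """Emit one output entry per input batch, even when sampling found no edges.

    seeds_by_batch: dict mapping input batch id -> list of seed vertices.
    edges_by_batch: dict mapping input batch id -> list of sampled edges;
                    batches with no samples may be missing or empty.
    """
    output = {}
    for batch_id, seeds in seeds_by_batch.items():
        # An empty batch keeps its original id and carries only its seeds.
        output[batch_id] = {
            "seeds": list(seeds),
            "edges": list(edges_by_batch.get(batch_id, [])),
        }
    return output


seeds = {0: [1], 1: [7], 2: [3]}
edges = {0: [(1, 2)], 2: [(3, 4)]}  # batch 1 sampled nothing
result = keep_empty_batches(seeds, edges)
```

With this approach the output ids are exactly the input ids, at the cost of consumers having to tolerate batches whose edge list is empty.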

alexbarghi-nv, Aug 15 '23 14:08

Related to #4201

alexbarghi-nv, Feb 28 '24 18:02