[DEBT] Input Batch IDs Don't Line Up With Output Batch IDs
#3789 resolved an issue where empty minibatches were dropped from the bulk sampler. As a side effect of that fix, output batch ids may no longer match the batch ids provided as input. This is not an issue for cuGraph-DGL and cuGraph-PyG, since both packages expect only the number of batches to match what is specified by the filename, and renumbering the remaining non-empty minibatches satisfies that.
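To make the mismatch concrete, here is a minimal pure-Python sketch of the renumbering behavior described above. It is not the cuGraph implementation: the function name `renumber_nonempty`, the dict-based batch layout, and the choice to renumber from 0 are all assumptions made for illustration.

```python
def renumber_nonempty(batches):
    """Drop empty minibatches and renumber the rest consecutively.

    `batches` maps an input batch id to its list of sampled rows.
    Hypothetical helper for illustration; not part of cuGraph.
    Returns (renumbered, id_map), where id_map[output_id] == input_id.
    """
    renumbered = {}
    id_map = {}
    for input_id in sorted(batches):
        rows = batches[input_id]
        if not rows:
            # An empty minibatch gets no output id at all.
            continue
        output_id = len(renumbered)
        renumbered[output_id] = rows
        id_map[output_id] = input_id
    return renumbered, id_map


# Input batch 1 is empty, so every later batch shifts down by one.
sampled = {0: ["a"], 1: [], 2: ["b", "c"], 3: ["d"]}
renumbered, id_map = renumber_nonempty(sampled)
```

In this sketch the output ids are contiguous, which is all a consumer that only counts batches needs, but output id 1 now holds the rows of input id 2. Recovering that relationship requires something like `id_map`, which the output itself does not carry.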
However, this could cause problems when debugging future bulk sampling issues, or if batch ids become important to end users. Ultimately, there should be a better way to handle empty batches, possibly by returning just the input seeds (which may better line up with end-user expectations), or some other solution.
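One way the "return the input seeds" idea could look is sketched below in plain Python. This is only an illustration of the proposal, not an existing cuGraph API: `sample_keep_ids`, the `seeds`/`edges` record layout, and the use of an empty edge list for empty batches are hypothetical.

```python
def sample_keep_ids(seeds_by_batch, edges_by_batch):
    """Emit one record per input batch id, including empty batches.

    Hypothetical helper for illustration; not part of cuGraph.
    A batch with no sampled edges still appears in the output,
    carrying its input seeds and an empty edge list.
    """
    result = {}
    for batch_id, seeds in seeds_by_batch.items():
        result[batch_id] = {
            "seeds": list(seeds),
            "edges": list(edges_by_batch.get(batch_id, [])),
        }
    return result


seeds = {0: [10, 11], 1: [12], 2: [13]}
edges = {0: [(10, 20)], 2: [(13, 21)]}  # batch 1 sampled no edges
kept = sample_keep_ids(seeds, edges)
```

Because every input batch id survives, output ids line up with input ids without a separate mapping. The trade-off is that downstream consumers would have to tolerate batches with no edges.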
Related to #4201