Add option for preserving batch information when removing self loops.
## 🚀 Feature
It seems unlikely that a user calling dgl.remove_self_loop() (https://docs.dgl.ai/generated/dgl.remove_self_loop.html?highlight=remove_self_loop#dgl.remove_self_loop) would want the side effect of batch information being removed. It can also make code confusing to read:
```python
graph = load_batched_graphs()
batch_nodes = graph.batch_num_nodes()
graph = dgl.remove_self_loops(graph)
recalculate_batch_info(graph, batch_nodes)
```
I'm requesting we add a parameter `preserve_batch` to remove_self_loops:
```python
def remove_self_loop(g, etype=None, preserve_batch=True)
```
Whether it should default to True or False depends on which release it lands in, i.e., whether we're fine with changing the default behavior.
Just a note (for when searching the code): two of the occurrences above are remove_self_loops, but the real function name is singular, remove_self_loop.
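To make the request concrete, here is a minimal, dependency-free sketch of the bookkeeping such a flag would need. The function name, signature, and graph representation are hypothetical (plain edge lists rather than DGL's actual API): removing self-loops never changes node counts, so batch_num_nodes can be carried over unchanged, while batch_num_edges must be recounted per batched component.

```python
def remove_self_loops_preserving_batch(edges, batch_num_nodes):
    """Hypothetical sketch, not DGL's implementation.

    edges: list of (src, dst) pairs in a batched graph.
    batch_num_nodes: number of nodes in each batched component.
    Returns (filtered_edges, batch_num_nodes, batch_num_edges).
    """
    # Node-id boundaries of each component in the batched graph.
    offsets = [0]
    for n in batch_num_nodes:
        offsets.append(offsets[-1] + n)

    # Drop self-loops; node counts are untouched.
    kept = [(u, v) for (u, v) in edges if u != v]

    # Recount surviving edges per component.
    batch_num_edges = [0] * len(batch_num_nodes)
    for u, _ in kept:
        # Find the component whose node-id range contains u.
        comp = next(i for i in range(len(batch_num_nodes))
                    if offsets[i] <= u < offsets[i + 1])
        batch_num_edges[comp] += 1

    return kept, list(batch_num_nodes), batch_num_edges
```

For example, batching a 3-node graph with a 2-node graph and removing the self-loops leaves per-component node counts intact while the per-component edge counts shrink accordingly.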
This appears to be fixed by https://github.com/dmlc/dgl/pull/3119, but:
- the docs need to be updated,
- and we should add a unit test to ensure batch information is preserved when removing self-loops (https://github.com/dmlc/dgl/blob/master/tests/compute/test_transform.py#L1750).