Problem in batching w.r.t. schemes and subgraphs
🐛 Bug
When batching graphs, schemes can be incorrectly checked in the case where a scheme is present but the number of edges with this scheme is 0.
To Reproduce
- Create two simple heterographs:

```python
import dgl
import torch

g1 = dgl.heterograph({("n", "r1", "n"): ([0], [1]), ("n", "r2", "n"): ((), ())})
g2 = dgl.heterograph({("n", "r1", "n"): ([0], [1]), ("n", "r2", "n"): ((), ())})
```
- Add a new edge to g2, with some data (this implicitly adds node 2 to g2, since `add_edges` creates missing endpoint nodes):

```python
g2.add_edges([0], [2], etype="r2", data={"r": torch.tensor([1])})
```
- Extract node subgraphs, which removes all edges of type "r2":

```python
g3 = dgl.node_subgraph(g1, [0, 1])
g4 = dgl.node_subgraph(g2, [0, 1])
```
- We then observe:

```python
>>> g3.edge_attr_schemes(etype="r2")
{'_ID': Scheme(shape=(), dtype=torch.int64)}
>>> g3.num_edges(etype="r2")
0
>>> g4.num_edges(etype="r2")
0
```

(as node 2 was removed, so was the only edge of type "r2"), BUT:

```python
>>> g4.edge_attr_schemes(etype="r2")
{'r': Scheme(shape=(), dtype=torch.int64), '_ID': Scheme(shape=(), dtype=torch.int64)}
```
- So, when trying to batch these graphs:

```python
b = dgl.batch([g3, g4])
```

we get:
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/dgl/batch.py", line 217, in batch
    ret_feat = _batch_feat_dicts(
  File "/usr/local/lib/python3.10/dist-packages/dgl/batch.py", line 247, in _batch_feat_dicts
    utils.check_all_same_schema(schemas, feat_dict_name)
  File "/usr/local/lib/python3.10/dist-packages/dgl/utils/checks.py", line 207, in check_all_same_schema
    raise DGLError(
dgl._ffi.base.DGLError: Expect all graphs to have the same schema on edges[('n', 'r2', 'n')].data, but graph 1 got
{'r': Scheme(shape=(), dtype=torch.int64), '_ID': Scheme(shape=(), dtype=torch.int64)}
which is different from
{'_ID': Scheme(shape=(), dtype=torch.int64)}.
```
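For context, the failing check compares each graph's feature-name-to-Scheme dict for exact equality, without special-casing edge types that have zero edges. A simplified sketch of that logic (an illustration based on the error message above, not DGL's actual implementation):

```python
from dgl import DGLError

def check_all_same_schema_sketch(schemes, name):
    # Illustration only: require every graph's {feature name: Scheme} dict
    # to equal the first graph's, even when the edge type has 0 edges.
    if not schemes:
        return
    first = schemes[0]
    for i, schema in enumerate(schemes[1:], start=1):
        if schema != first:
            raise DGLError(
                f"Expect all graphs to have the same schema on {name}, "
                f"but graph {i} got {schema} which is different from {first}."
            )
```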
Expected behavior
Correct batching, either by removing the scheme when the number of edges of a given type drops to 0, or by any other means.
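In the meantime, a possible workaround (a minimal sketch; `strip_empty_etype_feats` is a hypothetical helper, not a DGL API) is to delete the edge features of edge types that have zero edges, so the schemes agree before calling dgl.batch:

```python
def strip_empty_etype_feats(g):
    # Remove edge features on edge types with no edges so that
    # edge_attr_schemes() matches across graphs before dgl.batch.
    for etype in g.canonical_etypes:
        if g.num_edges(etype) == 0:
            for name in list(g.edges[etype].data.keys()):
                del g.edges[etype].data[name]
    return g

b = dgl.batch([strip_empty_etype_feats(g) for g in (g3, g4)])
```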
Environment
- DGL Version (e.g., 1.0): 1.1.2
- Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3): PyTorch 2.1.0
- OS (e.g., Linux): Linux
- How you installed DGL (conda, pip, source): pip
- Build command you used (if compiling from source):
- Python version: 3.10
- CUDA/cuDNN version (if applicable): 11.7
- GPU models and configuration (e.g. V100):
- Any other relevant information:
Yes, it's a current limitation that mutating a batched graph structure will corrupt its batching information. Could you try mutating graphs before batching?
Hi. I am not sure I understand. In my example, batching is done as the last step. Could you explain what "mutating graphs before batching" means? Thanks a lot.
This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you.