Bug in get_dependency_graph for datasets with multiple parents
Describe the bug
When a dataset has multiple parents, get_dependency_graph does not behave as expected: the returned graph is missing part of the ancestry.
Example: I created 3 datasets that depend on each other as shown in this figure:
When I retrieve the graph with get_dependency_graph, I would expect to see something like what the web interface shows. Instead, I only see dependencies down to layer 2:
Dependecy graph: {'8485a2145a64457b9704f9dd288d2dbc': [], 'd45f7216f74f4097b7e3d8c27c81217b': ['94dcbd8aa0a345c5b5b6fd7a601d6ae3']}
Expected behaviour
I would expect the dependencies to propagate all the way down to the layer 1 dataset.
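To make the expectation concrete, here is a minimal sketch in plain Python (no ClearML calls) of one way to spot an incomplete graph. It assumes the graph is a dict mapping each dataset ID to the list of its direct parent IDs, and that every dataset in the DAG should appear as a key. The helper name missing_nodes is hypothetical, not part of the SDK.

```python
def missing_nodes(graph: dict[str, list[str]]) -> set[str]:
    """Return parent IDs that are referenced but have no entry of their own."""
    # Collect every ID that appears as somebody's parent.
    referenced = {parent for parents in graph.values() for parent in parents}
    # Anything referenced but not present as a key is missing from the graph.
    return referenced - set(graph)


# The graph reported above, copied verbatim.
reported = {
    "8485a2145a64457b9704f9dd288d2dbc": [],
    "d45f7216f74f4097b7e3d8c27c81217b": ["94dcbd8aa0a345c5b5b6fd7a601d6ae3"],
}

print(missing_nodes(reported))
```

For the reported graph this flags 94dcbd8aa0a345c5b5b6fd7a601d6ae3, which is listed as a parent but never gets an entry of its own. The check only catches missing nodes; a graph that contains every node but lacks an edge would pass it.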
Environment
- Server type: self hosted
- ClearML SDK Version 2.0
- ClearML Server Version: WebApp: 2.0.0-613 • Server: 2.0.0-613 • API: 2.31
- Python Version 3.11
- OS Linux
Related Discussion
Related discussion: https://clearml.slack.com/archives/CTK20V944/p1749647076795919
Hi @tensorfreitas! I think this is indeed a bug. Does calling dataset._repair_dependency_graph() fix the problem for now?
@eugen-ajechiloae-clearml That does indeed seem to work! If I call
ds._repair_dependency_graph()
print(ds.get_dependency_graph())
it reports the correct structure. Thank you!
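For anyone who wants to apply the workaround only when needed, the following is a rough sketch rather than an official fix. It assumes ds is a dataset object exposing get_dependency_graph() and the private _repair_dependency_graph() mentioned above. Since that method is private, it may change or disappear in later SDK versions. The function name graph_with_repair is hypothetical, and the completeness check is only a heuristic that looks for parents without their own entry.

```python
def graph_with_repair(ds):
    """Return the dependency graph, repairing it first if it looks incomplete.

    `ds` is assumed to expose get_dependency_graph() and the private
    _repair_dependency_graph() discussed in this thread.
    """
    graph = ds.get_dependency_graph()
    # IDs that are referenced as parents somewhere in the graph.
    referenced = {parent for parents in graph.values() for parent in parents}
    # Heuristic: a parent without its own entry suggests a truncated graph.
    if referenced - set(graph):
        ds._repair_dependency_graph()  # private API, may change
        graph = ds.get_dependency_graph()
    return graph
```

With a real dataset, ds would come from something like Dataset.get(dataset_id=...). If the heuristic misses a broken graph, calling ds._repair_dependency_graph() unconditionally, as in the snippet above, remains the simpler option.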