pygraphistry
Deprecate DGL: freeze on CPU (torch >=2.5) + slow ABI movement
Summary
DGL support on CPU is effectively frozen and lags Torch ABI upgrades. The latest CPU wheels stop at DGL 2.1.0 (Torch ~2.0). Newer DGL versions (e.g., 2.4.0) publish only CUDA wheels. With Torch 2.5–2.9, we cannot run DGL-based features or tests on CPU without building DGL from source. We should move to PyTorch Geometric, which is actively maintained.
Evidence
- DGL wheel index (https://data.dgl.ai/wheels/repo.html) shows CPU wheels only up to 2.1.0; 2.4.0 is CUDA-only.
- Installing Torch 2.8/2.9 + DGL CPU fails (GraphBolt → torchdata pins to Torch 2.0.x).
- Our server Dockerfile uses Torch 2.9.0 + DGL 2.4.0 cu124 (GPU works), but CI CPU runners cannot match it.
- No CPU ABI updates from DGL for ~16 months.
Impact
- DGL-dependent tests (`embed_utils`, `networks`) break or get skipped on CPU with current Torch.
- CI CPU matrix cannot validate DGL paths, increasing regression risk.
Proposal
- Begin deprecating DGL and plan a migration to PyTorch Geometric (PyG) for GNN features.
- Document the supported Torch/DGL matrix:
  - CPU: Torch ~2.0 + DGL 2.1.0 only
  - GPU: DGL 2.4.0 cu124 + Torch 2.8/2.9
- Mark CPU DGL tests as `xfail` or disable them until migration.
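The supported matrix in this proposal could be encoded as data so that CI scripts and docs share one source of truth. The following is a minimal sketch, not existing pygraphistry code: the names `SUPPORTED`, `parse_major_minor`, and `in_documented_matrix` are hypothetical, and reading "Torch ~2.0" as the 2.0.x series is an assumption.

```python
# Sketch only: encodes the Torch/DGL matrix listed in this issue.
# All names here are hypothetical; "Torch ~2.0" is assumed to mean 2.0.x.

SUPPORTED = {
    # device: (allowed Torch major.minor versions, required DGL major.minor)
    "cpu": ({(2, 0)}, (2, 1)),
    "gpu": ({(2, 8), (2, 9)}, (2, 4)),
}


def parse_major_minor(version: str) -> tuple:
    """Turn a version such as '2.9.0+cu124' into (2, 9)."""
    core = version.split("+", 1)[0]  # drop a local tag like '+cu124'
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def in_documented_matrix(torch_version: str, dgl_version: str, cuda_available: bool) -> bool:
    """Return True if the combination appears in the matrix above."""
    torch_ok, dgl_required = SUPPORTED["gpu" if cuda_available else "cpu"]
    return (
        parse_major_minor(torch_version) in torch_ok
        and parse_major_minor(dgl_version) == dgl_required
    )
```

A CI step could call `in_documented_matrix` with the installed versions and report combinations that fall outside the matrix, rather than letting DGL fail at import time.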
Tasks
- Update CI and guides to align with the supported matrix; explicitly skip CPU DGL tests.
- Draft the PyG migration plan (feature parity, loaders, batching).
- Update documentation to reflect DGL limitations and upcoming deprecation.
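One way to skip CPU DGL tests explicitly is a collection hook in `conftest.py`. This is a minimal sketch under stated assumptions: the marker name `dgl` and the helpers `cpu_dgl_skip_reason` and `_detect` are hypothetical, and it skips on every CPU-only runner rather than checking the exact Torch version.

```python
# conftest.py (sketch): skip tests marked "dgl" when DGL cannot be expected to run.
# The marker name and helper names are hypothetical, not existing project code.
import importlib.util
from typing import Optional

DGL_MARKER = "dgl"  # assumed marker, applied as @pytest.mark.dgl on DGL tests


def cpu_dgl_skip_reason(dgl_installed: bool, cuda_available: bool) -> Optional[str]:
    """Return a skip reason for DGL tests, or None if they should run."""
    if not dgl_installed:
        return "DGL is not installed"
    if not cuda_available:
        return "DGL CPU wheels stop at 2.1.0 (Torch ~2.0); skipped on CPU until the PyG migration"
    return None


def _detect() -> tuple:
    """Probe the environment without failing when torch or dgl is absent."""
    dgl_installed = importlib.util.find_spec("dgl") is not None
    cuda_available = False
    if importlib.util.find_spec("torch") is not None:
        import torch

        cuda_available = torch.cuda.is_available()
    return dgl_installed, cuda_available


def pytest_collection_modifyitems(config, items):
    # Imported here so the helpers above stay importable without pytest.
    import pytest

    reason = cpu_dgl_skip_reason(*_detect())
    if reason is None:
        return
    skip = pytest.mark.skip(reason=reason)
    for item in items:
        if DGL_MARKER in item.keywords:
            item.add_marker(skip)
```

With this approach the custom marker would also need to be registered in the pytest configuration to avoid unknown-marker warnings. Swapping `pytest.mark.skip` for `pytest.mark.xfail` would match the `xfail` option in the proposal.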