
[Bug] ModuleNotFoundError: No module named 'torchdata.datapipes' on Linux aarch64 (DGL incompatibility)

Open rudgmleo opened this issue 5 months ago • 2 comments

Dear RFdiffusion / DGL Team,

I am writing to report a persistent issue I'm encountering while trying to set up RFdiffusion on a Linux aarch64 (ARM64) system. I would greatly appreciate any guidance or known solutions.

Problem Description: When attempting to import RFdiffusion's core modules, I consistently get a ModuleNotFoundError: No module named 'torchdata.datapipes'. This error originates within the dgl library, specifically when it tries to import the datapipes submodule from torchdata.

Environment:

  • Operating System: Linux aarch64 (Ubuntu 24.04.1 LTS)
  • Python Version: 3.9 (Conda environment)
  • PyTorch Version: 2.0.0 (CPU-only build from conda-forge)
  • TorchVision Version: 0.15.2
  • TorchAudio Version: 2.0.0 (installed via pip)
  • CUDA Toolkit: 11.8 (Installed via Conda, but PyTorch build is CPU)
  • DGL Version:
    • conda install attempts for versions 0.8.1, 0.9.1, 1.0.0, 1.1.2 (+cu118) resulted in PackagesNotFoundError (Conda could not find/resolve them).
    • pip install dgl (latest, 2.1.0) successfully installs DGL.
  • TorchData Version: 0.11.0 (installed via pip)
  • RFdiffusion Source: https://github.com/RosettaCommons/RFdiffusion
  • ColabDesign Source: https://github.com/sokrypton/ColabDesign
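
A version list like the one above can be collected programmatically, which avoids transcription errors when reporting an environment. The following is a minimal sketch using only the Python standard library. The helper name `installed_versions` is hypothetical, and the distribution names passed to it are assumptions that may need adjusting (for example, the conda package `pytorch` is usually registered as the distribution `torch`).

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names below are assumptions; adjust them to the environment.
    names = ["torch", "torchvision", "torchaudio", "torchdata", "dgl"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the activated SE3nv environment should report the versions that Python actually resolves, which may differ from what was requested at install time.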

Steps to Reproduce:

  1. Switched to a Python 3.9 environment.
    • Created a base environment using conda create -n SE3nv python=3.9.
    • Installed PyTorch ecosystem (pytorch==2.0.0, torchvision==0.15.2, cudatoolkit=11.8) from conda-forge. (This step succeeded.)
    • Installed DGL latest version (2.1.0) via pip install dgl. (This step succeeded.)
    • Installed TorchAudio 2.0.0 via pip install torchaudio==2.0.0. (This step succeeded.)
    • Installed RFdiffusion/env/SE3Transformer and ColabDesign (pip install -e .).
    • Installed remaining pip dependencies (omegaconf, icecream, pyrsistent, matplotlib, ipywidgets, py3Dmol, jupyterlab, etc.).
  2. Attempted python -c "from inference.utils import parse_pdb".

Expected Behavior: RFdiffusion modules should import successfully.

Actual Behavior (Traceback): ModuleNotFoundError: No module named 'torchdata.datapipes' (full traceback not provided).

Additional Context (What I've Tried):

  • Attempted installation in a Python 3.10 environment, which failed due to DGL, TorchData, and TorchAudio version conflicts.
  • For Python 3.9, conda install attempts for DGL versions (0.8.1, 0.9.1, 1.0.0, 1.1.2 with cu118 label) from dglteam and defaults channels consistently failed with PackagesNotFoundError. This indicates Conda cannot find or resolve these aarch64 builds.
  • Manually checked ~/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torchdata/ via ls -l and confirmed that the datapipes directory is physically missing. This strongly suggests a structural mismatch between DGL 2.1.0's requirements and the available TorchData 0.11.0 build for aarch64.
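
The manual directory check above can also be done from Python, without going through DGL's failing import. This is a minimal sketch using only the standard library; `has_submodule` is a hypothetical helper name, not part of DGL or TorchData.

```python
import importlib.util


def has_submodule(package, submodule):
    """Return True if `package.submodule` can be located, False otherwise."""
    try:
        # find_spec imports the parent package, then looks for the submodule.
        return importlib.util.find_spec(f"{package}.{submodule}") is not None
    except ModuleNotFoundError:
        # Raised when the parent package itself is not installed.
        return False


if __name__ == "__main__":
    print("torchdata.datapipes present:", has_submodule("torchdata", "datapipes"))
```

If the `datapipes` directory is missing as described, this check would be expected to report that the submodule is absent, independent of which DGL version is installed.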

Question: Are there any known working dgl / torchdata version combinations or specific installation instructions for linux-aarch64 that successfully resolve this torchdata.datapipes issue? Are there any official aarch64 Docker images or specific Dockerfile modifications known to work?

Thank you for your time and assistance.

rudgmleo avatar Jul 22 '25 09:07 rudgmleo

Hello, datapipes is no longer part of torchdata as of around June 2024. Please follow the installation steps in the RFdiffusion Installation Guide; the SE3nv.yml should set up the correct environment for you.

There is also a Docker image that you can use: RFdiffusion Docker Images. Note that if you are using this on an HPC system, that page also has instructions for running the image with Apptainer/Singularity.

rclune avatar Jul 22 '25 17:07 rclune

There is currently no Rosetta Commons Docker image that supports this architecture, nor are we aware of installation instructions that work on linux-aarch64 systems. We are looking into how to update the dependencies to avoid this issue, but currently do not have a timeline for this change.

rclune avatar Jul 25 '25 23:07 rclune