CDGS
                                
                                 CDGS copied to clipboard
                                
                                    CDGS copied to clipboard
                            
                            
                            
                        Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation
CDGS
Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation - AAAI 2023
The extension version: Learning Joint 2D & 3D Diffusion Models for Complete Molecule Generation [Paper] [Code].
Dependencies
- pytorch 1.11
- PyG 2.1
For NSPDK evaluation:
pip install git+https://github.com/fabriziocosta/EDeN.git --user
Others see requirements.txt .
Training
QM9
CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_qm9_cdgs.py --mode train --workdir exp/vpsde_qm9_cdgs
- Set GPU id via CUDA_VISIBLE_DEVICES.
- workdiris the directory path to save checkpoints, which can be changed to- YOUR_PATH. We provide the pretrained checkpoint in- exp/vpsde_qm9_cdgs.
- More hyperparameters in the config file configs/vp_qm9_cdgs.py
ZINC250k
# 256 hidden dimension
CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_zinc_cdgs.py --mode train --workdir exp/vpsde_zinc_cdgs_256 --config.training.n_iters 2500000
# 128 hidden dimension
CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_zinc_cdgs.py --mode train --workdir exp/vpsde_zinc_cdgs_128 --config.training.batch_size 128 --config.training.eval_batch_size 128 --config.training.n_iters 2500000
The pretrained checkpoints are provided in Google Drive 256ch and Google Drive 128ch.
Sampling
QM9
- EM sampling with 1000 steps
CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_qm9_cdgs.py --mode eval --workdir exp/vpsde_qm9_cdgs --config.eval.begin_ckpt 200 --config.eval.end_ckpt 200
- Add --config.eval.nspdkif apply NSPDK evaluation.
- Change iteration steps through --config.model.num_scales YOUR_STEPS.
- Change sampling batch size --config.eval.batch_sizeto control GPU memory usage.
- DPM-Solver examples
# Order 3; 50 step
CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_qm9_cdgs.py --mode eval --workdir exp/vpsde_qm9_cdgs --config.eval.begin_ckpt 200 --config.eval.end_ckpt 200 --config.sampling.method dpm3 --config.sampling.ode_step 50
# Order 2; 20 step
CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_qm9_cdgs.py --mode eval --workdir exp/vpsde_qm9_cdgs --config.eval.begin_ckpt 200 --config.eval.end_ckpt 200 --config.sampling.method dpm2 --config.sampling.ode_step 20
# Order 1; 10 step
CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_qm9_cdgs.py --mode eval --workdir exp/vpsde_qm9_cdgs --config.eval.begin_ckpt 200 --config.eval.end_ckpt 200 --config.sampling.method dpm1 --config.sampling.ode_step 10
ZINC250k
- EM sampling examples
# 1000 steps
CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_zinc_cdgs.py --mode eval --workdir exp/vpsde_zinc_cdgs_256 --config.eval.begin_ckpt 250 --config.eval.end_ckpt 250 --config.eval.batch_size 800
# 200 steps
CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_zinc_cdgs.py --mode eval --workdir exp/vpsde_zinc_cdgs_256 --config.eval.begin_ckpt 250 --config.eval.end_ckpt 250 --config.eval.batch_size 800 --config.model.num_scales 200
- DPM-Solver examples
# Order 3; 50 step
CUDA_VISIBLE_DEVICES=0 python main.py --config configs/vp_zinc_cdgs.py --mode eval --workdir exp/vpsde_zinc_cdgs_256 --config.eval.begin_ckpt 250 --config.eval.end_ckpt 250 --config.eval.batch_size 800 --config.sampling.method dpm3 --config.sampling.ode_step 50
Results
We provide molecules generated by CDGS: Google Drive.
Citation
@article{huang2023conditional,
  title={Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation},
  author={Huang, Han and Sun, Leilei and Du, Bowen and Lv, Weifeng},
  journal={arXiv preprint arXiv:2301.00427},
  year={2023}
}