Embedding graphs in symmetric spaces
sympa: Symmetric Spaces for Graph Embeddings
Code for the papers "Symmetric Spaces for Graph Embeddings: A Finsler-Riemannian Approach" (ICML 2021) and "Hermitian Symmetric Spaces for Graph Embeddings" (DiffGeo4DL @ NeurIPS 2020).
Available Models
Vector models:
- euclidean
- poincare
- lorentz
- sphere
- prod-hysph: Product of hyperbolic x sphere
- prod-hyhy: Product of hyperbolic x hyperbolic
- prod-hyeu: Product of hyperbolic x euclidean
Matrix models:
- spd: Symmetric positive definite matrices space
- upper: Upper half space model of the Siegel space
- bounded: Bounded domain model of the Siegel space
- dual: Compact dual
The last three (upper, bounded, and dual) support different metrics.
Available metrics
- riem: Riemannian metric
- fone: Finsler One
- finf: Finsler Infinity
- fmin: Finsler metric of minimum entropy
- wsum: Learns weights for a weighted sum of the vector-valued distance
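To illustrate how these metrics relate, here is a minimal sketch that collapses a vector-valued distance into a scalar. It assumes, based on one reading of the accompanying paper, that riem is the Euclidean norm of the vector, fone the sum of its entries, and finf its largest entry. The function name scalar_distance and the fixed weights passed for wsum are hypothetical (the list above says the weights are learned), and fmin is left out because its weighting is not described here. This is not the repository's implementation.

```python
import math
from typing import Optional, Sequence


def scalar_distance(vvd: Sequence[float], metric: str,
                    weights: Optional[Sequence[float]] = None) -> float:
    """Collapse a vector-valued distance (non-negative entries) to a scalar.

    Assumed formulas (not taken from this repository's code):
      riem -> sqrt(sum d_i^2), fone -> sum d_i, finf -> max d_i,
      wsum -> sum w_i * d_i with externally supplied weights.
    """
    if metric == "riem":
        return math.sqrt(sum(d * d for d in vvd))
    if metric == "fone":
        return sum(vvd)
    if metric == "finf":
        return max(vvd)
    if metric == "wsum":
        if weights is None or len(weights) != len(vvd):
            raise ValueError("wsum needs one weight per entry")
        return sum(w * d for w, d in zip(weights, vvd))
    raise ValueError(f"metric not covered by this sketch: {metric}")


example = [3.0, 4.0]  # hypothetical vector-valued distance
for name in ("riem", "fone", "finf"):
    print(name, scalar_distance(example, name))
```

Under these assumptions the three fixed metrics differ only in which norm they apply to the same vector, which is why they can share one distance computation.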
Requirements
- Python == 3.7
- PyTorch == 1.5.1: conda install pytorch==1.5.1 torchvision==0.6.1 [cpuonly | cudatoolkit=10.2] -c pytorch (in CPU environments, data parallel is not stable with PyTorch >= 1.6)
- Geoopt >= 0.3.1: installing from the repository is advised: pip install git+https://github.com/geoopt/geoopt.git
- XiTorch: for working with the compact dual only
- networkx and networkit: for preprocessing only
- matplotlib: for preprocessing only
- tensorboardx
- tqdm
Running experiments
1. Preprocess Data
In all preprocessing cases, the option --run_id=RUN_ID is required. The data will be saved in data/RUN_ID.
If --plot_graph is passed, a plot will be generated, but plotting can take a long time if the graph is large.
Grids:
python preprocess.py --graph=grid --grid_dims=DIMS --nodes=NODES --run_id=RUN_ID
It will create a grid of DIMS dimensions with int(NODES^(1/DIMS)) nodes per dimension.
Ex: python preprocess.py --graph=grid --grid_dims=3 --nodes=27 --run_id=RUN_ID will create a 3x3x3 cube graph.
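The node count rule above can be sketched as follows. This is an illustration, not the code in preprocess.py: the helper names grid_side and grid_edges are hypothetical, and the side length is computed as an integer root because a plain floating-point NODES ** (1 / DIMS) can land slightly below the exact value (for instance for 64 nodes in 3 dimensions) and then truncate to the wrong integer.

```python
from itertools import product
from typing import List, Tuple

Node = Tuple[int, ...]


def grid_side(nodes: int, dims: int) -> int:
    """Largest integer s with s**dims <= nodes (nodes per dimension)."""
    side = round(nodes ** (1.0 / dims))
    # Correct any floating-point drift in either direction.
    while side ** dims > nodes:
        side -= 1
    while (side + 1) ** dims <= nodes:
        side += 1
    return side


def grid_edges(side: int, dims: int) -> List[Tuple[Node, Node]]:
    """Edges of a dims-dimensional grid with `side` nodes along each axis."""
    edges = []
    for node in product(range(side), repeat=dims):
        for axis in range(dims):
            if node[axis] + 1 < side:
                neighbour = node[:axis] + (node[axis] + 1,) + node[axis + 1:]
                edges.append((node, neighbour))
    return edges


side = grid_side(27, 3)
print(side, len(grid_edges(side, 3)))  # side length and edge count of the cube
```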
Trees
python preprocess.py --graph=tree --tree_branching=BRANCHING --tree_height=HEIGHT --run_id=RUN_ID
It will create a tree with branching factor BRANCHING and height HEIGHT.
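For a sense of the resulting graph size, here is a minimal sketch of a balanced tree as an edge list. It assumes that height counts the edges from the root to a leaf (the convention NetworkX uses for balanced trees); the repository may count differently, and the function name balanced_tree_edges is hypothetical.

```python
from typing import List, Tuple


def balanced_tree_edges(branching: int, height: int) -> List[Tuple[int, int]]:
    """Edges (parent, child) of a balanced tree rooted at node 0."""
    edges = []
    frontier = [0]  # nodes on the current level
    next_id = 1     # next unused node id
    for _level in range(height):
        new_frontier = []
        for parent in frontier:
            for _child in range(branching):
                edges.append((parent, next_id))
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return edges


edges = balanced_tree_edges(branching=2, height=3)
print(len(edges) + 1, "nodes,", len(edges), "edges")
```

Under this convention the node count grows as (BRANCHING^(HEIGHT+1) - 1) / (BRANCHING - 1), so even moderate heights produce large graphs.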
Cartesian or Rooted products
By default, this creates the Cartesian or rooted product of a tree and a grid; the order of the factors can be changed in the code.
python preprocess.py --graph=product-cartesian --grid_dims=DIMS --nodes=NODES --tree_branching=BRANCHING --tree_height=HEIGHT --run_id=RUN_ID
It will create the Cartesian product of the specified tree and grid; pass --graph=product-rooted to create the rooted product instead.
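The difference between the two products can be sketched on plain edge lists. This follows the standard graph-theoretic definitions and is not taken from preprocess.py; the function names, and which factor plays the role of G or H, are assumptions made for illustration.

```python
from typing import Hashable, List, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def cartesian_product(nodes_g: Sequence, edges_g: Sequence[Edge],
                      nodes_h: Sequence, edges_h: Sequence[Edge]) -> List[Edge]:
    """Cartesian product: a copy of G per node of H and a copy of H per node of G."""
    edges = [((u, x), (v, x)) for (u, v) in edges_g for x in nodes_h]
    edges += [((u, x), (u, y)) for u in nodes_g for (x, y) in edges_h]
    return edges


def rooted_product(nodes_g: Sequence, edges_g: Sequence[Edge],
                   edges_h: Sequence[Edge], root: Hashable) -> List[Edge]:
    """Rooted product: one copy of G, with a copy of H hung from each node at `root`."""
    edges = [((u, root), (v, root)) for (u, v) in edges_g]
    edges += [((u, x), (u, y)) for u in nodes_g for (x, y) in edges_h]
    return edges


# Toy factors: a path on three nodes (G) and a single edge (H).
g_nodes, g_edges = [0, 1, 2], [(0, 1), (1, 2)]
h_nodes, h_edges = [0, 1], [(0, 1)]
print(len(cartesian_product(g_nodes, g_edges, h_nodes, h_edges)))
print(len(rooted_product(g_nodes, g_edges, h_edges, root=0)))
```

Both products have the same node set; the rooted product keeps only the copy of G that passes through the root of H, so it has fewer edges.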
Social Networks
python preprocess.py --graph=TYPE --run_id=RUN_ID
Currently available options are social-karate, social-davis, social-florentine, and social-miserables.
See the NetworkX documentation.
Expanders
python preprocess.py --graph=TYPE --nodes=NODES --run_id=RUN_ID
Currently available options are expander-margulis, expander-chordal, and expander-paley.
See the NetworkX expanders documentation.
Custom
python preprocess.py --graph=NAME --run_id=RUN_ID
It will look for a file in data/NAME/NAME.edges where the graph should be represented as:
src_node1 dst_node1 [weight1]
src_node2 dst_node2 [weight2]
...
where src_node and dst_node are int values and weight is an optional float value.
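A file in this format can be checked before preprocessing with a small parser like the one below. It is a sketch, not the loader used by preprocess.py: the function name parse_edges is hypothetical, and treating a missing weight as 1.0 is an assumption.

```python
from typing import List, Tuple


def parse_edges(text: str, default_weight: float = 1.0) -> List[Tuple[int, int, float]]:
    """Parse lines of the form 'src dst [weight]' into (src, dst, weight) tuples."""
    edges = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) not in (2, 3):
            raise ValueError(f"line {line_no}: expected 'src dst [weight]', got {line!r}")
        src, dst = int(parts[0]), int(parts[1])
        # Assumption: edges without an explicit weight get default_weight.
        weight = float(parts[2]) if len(parts) == 3 else default_weight
        edges.append((src, dst, weight))
    return edges


sample = "0 1\n1 2 0.5\n"
print(parse_edges(sample))
```

To validate a real file, read it first, for example with pathlib.Path("data/NAME/NAME.edges").read_text(), and pass the contents to parse_edges.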
2. Train Graph Embeddings
python -m torch.distributed.launch --nproc_per_node=N_CPUS --master_port=2055 train.py \
--n_procs=N_CPUS \
--data=PREP \
--run_id=RUN_ID \
--results_file=out/results.csv \
--model=MODEL \
--metric=riem \
--dims=4 \
--learning_rate=1e-2 \
--val_every=25 \
--patience=50 \
--max_grad_norm=100 \
--batch_size=2048 \
--epochs=1000
N_CPUS is the number of processes to launch; experiments can be distributed over multiple CPUs/GPUs.
PREP must be the name of the graph to embed (the RUN_ID used in step 1).
Results will be reported in results_file, with run_id as the name.
For model and metric, see Available Models and Available metrics.
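When comparing several models on the same graph, the command above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in this section; the helper name build_train_command, the graph name grid3d, and the run-id naming scheme are hypothetical. Nothing is executed until the list is passed to subprocess.run.

```python
import sys
from typing import List


def build_train_command(n_procs: int, data: str, run_id: str,
                        model: str, metric: str = "riem", dims: int = 4) -> List[str]:
    """Assemble the training command from this section as an argument list."""
    return [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={n_procs}", "--master_port=2055", "train.py",
        f"--n_procs={n_procs}",
        f"--data={data}",
        f"--run_id={run_id}",
        "--results_file=out/results.csv",
        f"--model={model}",
        f"--metric={metric}",
        f"--dims={dims}",
        "--learning_rate=1e-2",
        "--val_every=25",
        "--patience=50",
        "--max_grad_norm=100",
        "--batch_size=2048",
        "--epochs=1000",
    ]


# Hypothetical sweep over two matrix models on a graph preprocessed as "grid3d".
for model in ("upper", "bounded"):
    cmd = build_train_command(n_procs=2, data="grid3d",
                              run_id=f"grid3d-{model}", model=model)
    print(" ".join(cmd))  # run with subprocess.run(cmd, check=True)
```

Giving each run its own run_id keeps the rows in the results file distinguishable.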
Considerations
The method inner is implemented for both the Upper Half space and the Bounded domain models, so experiments can be run with RiemannianAdam.
However, we found these runs to be very unstable; therefore, all experiments reported in the paper were run with RiemannianSGD.
TODO
- [ ] Fix broken tests
- [ ] Add Finsler Metrics to SPD manifold
- [ ] Merge branch with Recommender System experiments into master
- [ ] Merge branch with plotting tools into master
Citation
The source code and data in this repository aim to facilitate the study of graph embeddings in symmetric spaces. If you use the code or data, please cite it as follows:
@InProceedings{lopez2021symmetric,
title = {Symmetric Spaces for Graph Embeddings: A Finsler-Riemannian Approach},
author = {L\'opez, Federico and Pozzetti, Beatrice and Trettel, Steve and Strube, Michael and Wienhard, Anna},
booktitle = {Proceedings of the 38th International Conference on Machine Learning},
pages = {7090--7101},
year = {2021},
editor = {Meila, Marina and Zhang, Tong},
volume = {139},
series = {Proceedings of Machine Learning Research},
month = {18--24 Jul},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v139/lopez21a/lopez21a.pdf},
url = {http://proceedings.mlr.press/v139/lopez21a.html}
}