Comparative Embedding Visualization with cev
cev is an interactive Jupyter widget for comparing a pair of 2D embeddings with shared labels.
Its novel metric makes it possible to surface differences in label confusion, neighborhood composition, and label size.
The figure shows data from Mair et al. (2022) that were analyzed with Greene et al.'s (2021) FAUST method.
The embeddings were generated with Greene et al.'s (2021) annotation transformation and UMAP.
cev is implemented with anywidget and builds upon jupyter-scatter.
Installation
Warning: cev is new and under active development. It is not yet ready for production, and its APIs are subject to change.
```sh
pip install cev
```
Getting Started
```python
import pandas as pd

from cev.widgets import Embedding, EmbeddingComparisonWidget

# Load two embeddings of the same data that share annotation labels.
umap_embedding = Embedding.from_ozette(df=pd.read_parquet("../data/mair-2022-tissue-138-umap.pq"))
ozette_embedding = Embedding.from_ozette(df=pd.read_parquet("../data/mair-2022-tissue-138-ozette.pq"))

# Compare the two embeddings side by side using the label-confusion metric.
umap_vs_ozette = EmbeddingComparisonWidget(
    umap_embedding,
    ozette_embedding,
    titles=["Standard UMAP", "Annotation-Transformed UMAP"],
    metric="confusion",
    selection="synced",
    auto_zoom=True,
    row_height=320,
)
umap_vs_ozette
```
See notebooks/getting-started.ipynb for the complete example.
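The same constructor can be reused to compare the embeddings by a different aspect of the labels. The following is a minimal sketch; treat `"neighborhood"` as an assumed `metric` value (only `"confusion"` appears above) and check the cev documentation for the supported metric names.

```python
# A minimal sketch reusing the embeddings loaded above.
# Assumption: "neighborhood" is an accepted `metric` value; only "confusion"
# is shown in the example above, so verify against the cev docs.
neighborhood_comparison = EmbeddingComparisonWidget(
    umap_embedding,
    ozette_embedding,
    titles=["Standard UMAP", "Annotation-Transformed UMAP"],
    metric="neighborhood",
    selection="synced",
    auto_zoom=True,
    row_height=320,
)
neighborhood_comparison
```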
Development
First, create a virtual environment with all the required dependencies. We highly recommend using hatch, which installs and syncs all dependencies from pyproject.toml automatically.
```sh
hatch shell
```
Alternatively, you can use conda.
```sh
conda create -n cev python=3.11
conda activate cev
```
Next, install cev with all development assets.
```sh
pip install -e ".[notebooks,dev]"
```
Finally, run the notebooks with:
```sh
jupyter lab
```
Commands Cheatsheet
If you are using the hatch CLI, the following commands are available in the default environment:
| Command | Action |
|---|---|
| `hatch run fix` | Format the project with `black .` and apply linting fixes with `ruff --fix .` |
| `hatch run fmt` | Format the project with `black .` and apply linting fixes with `ruff --fix .` |
| `hatch run check` | Check formatting and linting with `black --check .` and `ruff .` |
| `hatch run test` | Run unit tests with `pytest` in the base environment |
| `hatch run test:test` | Run unit tests with `pytest` in all supported environments |
Alternatively, you can develop cev by manually creating a virtual environment and managing dependencies with pip.
Our CI linting/formatting checks are configured with pre-commit.
We recommend installing the git hook scripts to allow pre-commit to run automatically on git commit.
```sh
pre-commit install  # run this once to install the git hooks
```
This will ensure that code pushed to CI meets our linting and formatting criteria. Code that does not comply will fail in CI.
Release
Releases are triggered via tagged commits:
```sh
git tag -a vX.X.X -m "vX.X.X"
git push --follow-tags
```
License
cev is distributed under the terms of the Apache License 2.0.
Citation
If you use cev in your research, please cite the following preprint:
```bibtex
@article{manz2024cev,
  title = {A General Framework for Comparing Embedding Visualizations Across Class-Label Hierarchies},
  url = {https://osf.io/puxnf},
  doi = {10.31219/osf.io/puxnf},
  publisher = {OSF Preprints},
  author = {Manz, Trevor and Lekschas, Fritz and Greene, Evan and Finak, Greg and Gehlenborg, Nils},
  year = {2024},
  month = {Apr}
}
```