Fast computation of diffusion maps and geometric harmonics in Python. Moved to https://git.sr.ht/~jmbr/diffusion-maps

#+TITLE: Diffusion Maps and Geometric Harmonics for Python
#+AUTHOR: Juan M. Bello Rivas
#+EMAIL: [email protected]
#+DATE: <2019-04-08 Mon>

  • Overview

The =diffusion-maps= library for Python provides a fast and accurate implementation of diffusion maps[fn:1] and geometric harmonics[fn:2]. Its speed stems from the use of sparse linear algebra and, optionally, graphics processing units (GPUs) to accelerate computations. When run on a GPU, the included code routinely solves eigenvalue problems three times faster than SciPy on matrices with over 200 million nonzero entries.

The package includes a command-line utility for the quick calculation of diffusion maps on data sets.

Some of the features of the =diffusion-maps= module include:

  • Fast evaluation of distance matrices using nearest neighbors.

  • Fast and accurate computation of eigenvalue/eigenvector pairs using sparse linear algebra.

  • Optional GPU-accelerated sparse linear algebra routines.

  • Optional interface to the [[https://github.com/opencollab/arpack-ng][ARPACK-NG]] library.

  • Simple and easily modifiable code.
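The features listed above can be illustrated with a minimal sketch that uses only NumPy and SciPy. It is not the library's API: the function name =diffusion_map_sketch=, its parameters, the cut-off radius, and the kernel normalization $\exp(-d^2 / 2\varepsilon)$ are all assumptions made for the example. The sketch builds a sparse kernel from nearest-neighbor pairs, normalizes it into a Markov matrix, and computes a few leading eigenvalue/eigenvector pairs with a sparse solver.

```python
import numpy as np
from scipy.sparse import coo_matrix, diags
from scipy.sparse.linalg import eigsh
from scipy.spatial import cKDTree


def diffusion_map_sketch(points, epsilon, num_pairs=3, cutoff=3.0):
    """Return leading eigenpairs of a sparse diffusion operator.

    Hypothetical helper for illustration; not part of diffusion-maps.
    """
    n = points.shape[0]

    # Nearest-neighbor search: keep only pairs closer than a cut-off
    # radius (an assumed multiple of sqrt(epsilon)) so the kernel is sparse.
    tree = cKDTree(points)
    pairs = tree.query_pairs(cutoff * np.sqrt(epsilon), output_type="ndarray")
    i, j = pairs[:, 0], pairs[:, 1]
    sq_dist = np.sum((points[i] - points[j]) ** 2, axis=1)

    # Gaussian kernel (assumed normalization), symmetrized, unit diagonal.
    weights = np.exp(-sq_dist / (2.0 * epsilon))
    diagonal = np.arange(n)
    rows = np.concatenate([i, j, diagonal])
    cols = np.concatenate([j, i, diagonal])
    vals = np.concatenate([weights, weights, np.ones(n)])
    kernel = coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()

    # The Markov matrix P = D^{-1} K is similar to the symmetric matrix
    # S = D^{-1/2} K D^{-1/2}, which a symmetric sparse solver can handle.
    degrees = np.asarray(kernel.sum(axis=1)).ravel()
    inv_sqrt = diags(1.0 / np.sqrt(degrees))
    symmetric = inv_sqrt @ kernel @ inv_sqrt

    eigenvalues, vectors = eigsh(symmetric, k=num_pairs, which="LA")
    order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
    # Map eigenvectors of S back to right eigenvectors of P.
    eigenvectors = inv_sqrt @ vectors[:, order]
    return eigenvalues[order], eigenvectors


# Example: 200 equally spaced points on the unit circle.
angles = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.column_stack([np.cos(angles), np.sin(angles)])
eigenvalues, eigenvectors = diffusion_map_sketch(circle, epsilon=0.05)
```

Working with the symmetric conjugate rather than the Markov matrix itself is a common design choice, since it allows the use of a symmetric eigensolver. The leading eigenvalue is 1 and its eigenvector is constant, so the useful embedding coordinates start at the second eigenvector.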

[fn:1] Coifman, R. R., & Lafon, S. (2006). Diffusion maps. Applied and Computational Harmonic Analysis, 21(1), 5–30. http://doi.org/10.1016/j.acha.2006.04.006

[fn:2] Coifman, R. R., & Lafon, S. (2006). Geometric harmonics: A novel tool for multiscale out-of-sample extension of empirical functions. Applied and Computational Harmonic Analysis, 21(1), 31–52. http://doi.org/10.1016/j.acha.2005.07.005

#+CAPTION: Geometric harmonics for $z = \sin(x^2 + y^2)$.
#+NAME: fig:geometric-harmonics
[[./geometric-harmonics.png]]
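For readers unfamiliar with geometric harmonics, the following is a rough, dense sketch of the idea applied to the function from the figure. It is not the library's implementation: the names =gaussian_kernel= and =geometric_harmonics_sketch=, the kernel normalization, the value of =epsilon=, and the eigenvalue cut-off =delta= are assumptions chosen for illustration. A function known on sample points is projected onto the eigenvectors of a kernel matrix, and each eigenvector is extended to new points with the Nyström formula.

```python
import numpy as np
from scipy.spatial.distance import cdist


def gaussian_kernel(a, b, epsilon):
    """Gaussian kernel between two point sets (assumed normalization)."""
    return np.exp(-cdist(a, b, "sqeuclidean") / (2.0 * epsilon))


def geometric_harmonics_sketch(samples, values, epsilon, delta=1e-6):
    """Return a function extending `values` from `samples` to new points.

    Hypothetical helper for illustration; not part of diffusion-maps.
    """
    kernel = gaussian_kernel(samples, samples, epsilon)
    eigenvalues, eigenvectors = np.linalg.eigh(kernel)

    # Discard eigenpairs whose eigenvalue is tiny relative to the largest
    # one; dividing by them below would amplify numerical noise.
    keep = eigenvalues > delta * eigenvalues.max()
    eigenvalues, eigenvectors = eigenvalues[keep], eigenvectors[:, keep]

    # Coefficients of the sampled function in the eigenvector basis.
    coefficients = eigenvectors.T @ values

    def extend(new_points):
        # Nystrom extension of every retained eigenvector to new_points.
        cross = gaussian_kernel(new_points, samples, epsilon)
        harmonics = (cross @ eigenvectors) / eigenvalues
        return harmonics @ coefficients

    return extend


# Example: sample z = sin(x^2 + y^2) and evaluate the extension elsewhere.
rng = np.random.default_rng(0)
samples = rng.uniform(-1.5, 1.5, size=(400, 2))
values = np.sin(np.sum(samples ** 2, axis=1))
extend = geometric_harmonics_sketch(samples, values, epsilon=0.1)

new_points = rng.uniform(-1.0, 1.0, size=(5, 2))
extended = extend(new_points)
```

Evaluated at the sample points themselves, the extension returns the projection of the sampled values onto the retained eigenvectors, so the cut-off controls the trade-off between fidelity and numerical stability.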

  • Prerequisites

The library requires Python 3.5 or later and uses [[http://www.numpy.org/][NumPy]] and [[https://www.scipy.org/][SciPy]]. Installing [[https://mathema.tician.de/software/pycuda/][PyCUDA]] is recommended to enable the GPU-accelerated eigenvalue solver.

The =diffusion-maps= command can display the resulting diffusion maps using [[https://matplotlib.org/][Matplotlib]] if it is available.

  • Installation

Use ~python setup.py install~ to install on your system or ~python setup.py install --user~ for a user-specific installation.

  • Command-line utility

The ~diffusion-maps~ command reads data sets stored in NPY, CSV, or MATLAB's MAT format. The simplest way to use it is to invoke it as follows:

#+BEGIN_SRC bash
diffusion-maps DATA-SET.NPY EPSILON-VALUE
#+END_SRC

Additional options let you save and visualize different types of results and specify how many eigenvalue/eigenvector pairs to compute, among other things. See the help page displayed by:

#+BEGIN_SRC bash
diffusion-maps --help
#+END_SRC
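To try the command on synthetic data, you first need a file in one of the supported formats. The sketch below writes a Swiss-roll point cloud to an NPY file with NumPy. The file name =swiss-roll.npy= is hypothetical, and storing one data point per row is an assumption about the expected layout that this document does not confirm.

```python
import numpy as np

# Generate 1000 points on a Swiss roll embedded in three dimensions.
rng = np.random.default_rng(seed=0)
t = 1.5 * np.pi * (1.0 + 2.0 * rng.random(1000))
height = 10.0 * rng.random(1000)
swiss_roll = np.column_stack([t * np.cos(t), height, t * np.sin(t)])

# One point per row (assumed layout); the file name is hypothetical.
np.save("swiss-roll.npy", swiss_roll)
```

The resulting file can then be passed to ~diffusion-maps~ in place of =DATA-SET.NPY=, together with an epsilon value suited to the scale of the data.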

  • Additional documentation

[[http://www.sphinx-doc.org/en/stable/][Sphinx]]-based API documentation is available in the =doc/= folder. Run

#+BEGIN_SRC bash
make -C doc html
#+END_SRC

to build the documentation.

  • License

This code is released under the MIT license. See =LICENSE= for details.

  • Citation

If you use this code in publications, please cite it as:

  • Juan M. Bello-Rivas. (2017, May 20). jmbr/diffusion-maps 0.0.1 (Version 0.0.1). Zenodo. http://doi.org/10.5281/zenodo.581667

  • Acknowledgments

The =diffusion-maps= library was originally written by Juan M. Bello-Rivas.

Others have contributed to =diffusion-maps= by reporting problems, suggesting improvements, or submitting code. The following list names these people. Help me keep it complete and free of errors.

  • Felix Dietrich,
  • Mahdi Kooshkbaghi,
  • Daniel Lehmberg,
  • Philipp Schuegraf