Scalable sequence-informed embedding of single-cell ATAC-seq data with CellSpace

CellSpace

CellSpace is a sequence-informed embedding method for scATAC-seq that learns a mapping of DNA k-mers and cells to the same space.

See our pre-print for more details.

Installation and Usage

  1. Compile the C++ program to use as a command line tool to train a CellSpace model.

CellSpace uses the C++ implementation of StarSpace [Wu et al., 2017] and builds on modern macOS and Linux distributions. It requires a compiler with C++11 support and a working make.
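
As a quick sanity check before building, the sketch below looks for make and for a compiler that accepts `-std=c++11`. The function names `find_cxx` and `supports_cxx11` and the compiler search order are illustrative assumptions, not part of CellSpace.

```shell
#!/bin/sh
# Sketch: check the build prerequisites (make and a C++11-capable compiler).
# find_cxx and supports_cxx11 are hypothetical helper names.

find_cxx() {
    # Prefer $CXX if it is set, else the first of c++, g++, clang++ on PATH.
    for candidate in "${CXX:-}" c++ g++ clang++; do
        [ -n "$candidate" ] || continue
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

supports_cxx11() {
    # Try to compile a tiny program that uses C++11 features (auto, nullptr).
    cxx=$1
    tmpdir=$(mktemp -d) || return 1
    printf 'int main() { auto p = nullptr; return p == nullptr ? 0 : 1; }\n' > "$tmpdir/t.cpp"
    "$cxx" -std=c++11 "$tmpdir/t.cpp" -o "$tmpdir/t" >/dev/null 2>&1
    rc=$?
    rm -rf "$tmpdir"
    return $rc
}

if command -v make >/dev/null 2>&1; then
    echo "make: ok"
else
    echo "make: not found"
fi

if cxx=$(find_cxx) && supports_cxx11 "$cxx"; then
    echo "C++11 compiler: ok ($cxx)"
else
    echo "C++11 compiler: not found"
fi
```

If either check reports "not found", install the missing tool with your system's package manager before running make.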

Install the Boost library and set the BOOST_DIR variable in the makefile to its path. The default path will work if you install Boost as follows:

wget https://boostorg.jfrog.io/artifactory/main/release/1.63.0/source/boost_1_63_0.zip
unzip boost_1_63_0.zip
sudo mv boost_1_63_0 /usr/local/bin
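
To confirm that the Boost headers are where the build expects them, a check along these lines may help. The helper name `has_boost_headers` is hypothetical, and the fallback path assumes that the makefile default corresponds to the install commands above; adjust it if your makefile sets BOOST_DIR differently.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DIR looks like an unpacked Boost source tree.
has_boost_headers() {
    [ -f "$1/boost/version.hpp" ]
}

# Assumption: the directory created by the install commands above.
BOOST_DIR=${BOOST_DIR:-/usr/local/bin/boost_1_63_0}

if has_boost_headers "$BOOST_DIR"; then
    # boost/version.hpp defines BOOST_LIB_VERSION; print that line.
    grep '#define BOOST_LIB_VERSION' "$BOOST_DIR/boost/version.hpp"
else
    echo "No Boost headers under $BOOST_DIR; point BOOST_DIR in the makefile at your Boost directory."
fi
```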

Download and build CellSpace:

git clone https://github.com/zakieh-tayyebi/CellSpace.git
cd CellSpace/cpp/
make
export PATH=$(pwd):$PATH
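
The export above only affects the current shell session. One way to make it persistent is to append the same line to a shell startup file, as in this sketch. The helper name `persist_path_entry` is hypothetical, and which startup file to use (for example `~/.bashrc` or `~/.zshrc`) depends on your shell.

```shell
#!/bin/sh
# Hypothetical helper: append an export line for DIR to RCFILE,
# unless that exact line is already present.
persist_path_entry() {
    dir=$1
    rcfile=$2
    line="export PATH=\"$dir:\$PATH\""
    # -F: fixed string, -x: match the whole line, -q: no output.
    if ! grep -Fxq -- "$line" "$rcfile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rcfile"
    fi
}
```

For example, running `persist_path_entry "$(pwd)" "$HOME/.bashrc"` from CellSpace/cpp/ would add the build directory to PATH for future bash sessions.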

Verify that it was successfully compiled:

CellSpace --help

  2. Install the R package to use the trained CellSpace model for downstream analysis.

Run the following commands in R:

install.packages("devtools")
devtools::install_github("https://github.com/zakieh-tayyebi/CellSpace.git")
library(CellSpace)

Installation should take only a few minutes. For details about the R functions, please refer to the API.

  3. A tutorial on CellSpace usage can be found here.

Citation

Please cite the bioRxiv preprint if you use CellSpace:

Tayyebi, Z., Pine, A. R., & Leslie, C. S. (2023). Scalable sequence-informed embedding of single-cell ATAC-seq data with CellSpace. bioRxiv. https://doi.org/10.1101/2022.05.02.490310

Contact