wgbs_tools
Tools for working with bisulfite sequencing data while preserving reads' intrinsic dependencies
wgbstools - suite for DNA methylation sequencing data representation, visualization, and analysis
wgbstools is an extensive computational suite tailored for bisulfite sequencing data. It allows fast access and ultra-compact representation of high-throughput data, as well as machine learning and statistical analysis, and informative visualizations, from fragment-level to locus-specific representations.
It converts data from standard formats (e.g., bam, bed) into compact yet intuitive formats tailored to methylation data (pat, beta). These can be visualized in the terminal or analyzed in different ways: subsample, merge, slice, mix, segment, and more.
This project is developed by Netanel Loyfer and Jonathan Rosenski in Prof. Tommy Kaplan's lab at the Hebrew University, Jerusalem, Israel.
Quick start
Installation
# Clone
git clone https://github.com/nloyfer/wgbs_tools.git
cd wgbs_tools
# compile
python setup.py
Genome configuration
At least one reference genome must be configured (takes a few minutes).
wgbstools init_genome GENOME_NAME
# e.g.,
wgbstools init_genome hg19
wgbstools init_genome mm9
wgbstools downloads the requested reference FASTA file from the UCSC website.
If you prefer using your own reference FASTA, specify the path to the FASTA as follows.
wgbstools init_genome GENOME_NAME --fasta_path /path/to/genome.fa
Dependencies
- python 3+, with libraries:
  - pandas version 1.0+
  - numpy
  - scipy
- samtools
- tabix / bgzip
Dependencies for some features:
- bedtools
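Before running wgbstools, it can be useful to confirm that the dependencies listed above are available. The following is a minimal sketch and not part of wgbstools: the helper name `check_dependencies` is hypothetical, and it only checks that the Python modules are importable and the executables are on `PATH`. It does not check versions (for example, pandas 1.0+).

```python
import importlib.util
import shutil

# Names taken from the dependency list above.
PYTHON_MODULES = ["pandas", "numpy", "scipy"]
EXECUTABLES = ["samtools", "tabix", "bgzip", "bedtools"]  # bedtools: some features only


def check_dependencies(modules=PYTHON_MODULES, executables=EXECUTABLES):
    """Return a dict mapping each dependency name to True (found) or False (missing).

    Hypothetical helper, not part of wgbstools. Versions are not checked.
    """
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        status[name] = importlib.util.find_spec(name) is not None
    for name in executables:
        # shutil.which returns None when the executable is not on PATH.
        status[name] = shutil.which(name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```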
Usage examples
Now you can generate pat.gz and beta files out of bam files:
wgbstools bam2pat Sigmoid_Colon_STL003.bam
# output:
# Sigmoid_Colon_STL003.pat.gz
# Sigmoid_Colon_STL003.beta
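To convert several bam files, the command above can be wrapped in a small script. This is a sketch under stated assumptions: the helper names (`bam2pat_command`, `expected_outputs`, `convert_all`) are hypothetical and not part of wgbstools, the only command-line form used is the `wgbstools bam2pat FILE.bam` call shown above, and the output file names are assumed to follow the same pattern as in that example.

```python
import subprocess
from pathlib import Path


def bam2pat_command(bam_path):
    """Build the `wgbstools bam2pat` command line shown above for one bam file."""
    return ["wgbstools", "bam2pat", str(bam_path)]


def expected_outputs(bam_path):
    """Output file names, assuming they mirror the example above
    (SAMPLE.bam -> SAMPLE.pat.gz and SAMPLE.beta)."""
    stem = Path(bam_path).stem  # file name without the ".bam" suffix
    return [f"{stem}.pat.gz", f"{stem}.beta"]


def convert_all(bam_dir, dry_run=True):
    """Build (and optionally run) bam2pat for every *.bam file in bam_dir.

    With dry_run=True the commands are only returned, not executed.
    """
    commands = [bam2pat_command(p) for p in sorted(Path(bam_dir).glob("*.bam"))]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises if wgbstools exits non-zero
    return commands
```

Calling `convert_all("bams/")` first, with the default `dry_run=True`, lists the commands that would run; passing `dry_run=False` executes them one at a time.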
Once you have pat and beta files, you can use wgbstools to visualize them. For example:
wgbstools vis Sigmoid_Colon_STL003.pat.gz -r chr3:119528843-119529245
wgbstools vis *.beta -r chr3:119528843-119529245 --heatmap
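For analysis outside the terminal, a beta file can also be read directly. The sketch below rests on an assumption that this README does not state and that should be verified against the wgbstools format documentation: that a beta file is a flat binary array of unsigned 8-bit integers holding two values per CpG site (methylated count, then total coverage). The helper names `read_beta` and `methylation_fraction` are hypothetical.

```python
from pathlib import Path


def read_beta(path):
    """Read a beta file into a list of (n_methylated, n_covered) pairs.

    ASSUMPTION: the file is a sequence of uint8 pairs, one pair per CpG site.
    """
    data = Path(path).read_bytes()
    if len(data) % 2 != 0:
        raise ValueError("expected an even number of bytes (two per CpG site)")
    return [(data[i], data[i + 1]) for i in range(0, len(data), 2)]


def methylation_fraction(pairs):
    """Per-site methylated/covered ratio; None where a site has no coverage."""
    return [m / c if c > 0 else None for m, c in pairs]


if __name__ == "__main__":
    import tempfile

    # Write a tiny synthetic file with three sites, then read it back.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo.beta"
        demo.write_bytes(bytes([3, 4, 0, 0, 10, 10]))
        pairs = read_beta(demo)
        print(pairs)                        # [(3, 4), (0, 0), (10, 10)]
        print(methylation_fraction(pairs))  # [0.75, None, 1.0]
```

A plain Python list keeps the example dependency-free but is slow for whole-genome files. Under the same layout assumption, `numpy.fromfile(path, dtype=numpy.uint8).reshape(-1, 2)` would be the more practical choice.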
Deconvolution
To deconvolve tissue or blood samples, see our UXM software.
References
If you are using wgbstools, please cite:
Loyfer et al. (2024) 'wgbstools: A computational suite for DNA methylation sequencing data representation, visualization, and analysis', bioRxiv, 2024.
[GEO GSE186458 | Genome browser sessions: hg19 | hg38]