biobtree
A bioinformatics tool to search, map and retrieve identifiers, keywords and attributes
Biobtree
Biobtree is a bioinformatics tool that maps bioinformatics datasets through identifiers and special keywords, with support for simple or advanced chain queries.
Features
- Datasets - supports a wide range of datasets such as Ensembl, Uniprot, ChEMBL, HMDB, Taxonomy, GO, EFO, HGNC, ECO, Uniparc and Uniref, with tens more via cross references, by retrieving the latest data from providers
- MapReduce - processes small or large datasets based on the user's selection and builds a B+ tree based uniform local database via a specialized MapReduce based technique with efficient storage usage
- Query - allows simple or advanced chain queries between datasets with an intuitive syntax that allows writing RDF or graph like queries
- Genome - supports querying full Ensembl genome coordinates with transcript, CDS, exon and utr, with several attributes, mapped datasets and identifiers such as ortholog, paralog or probe identifiers belonging to Affymetrix or Illumina
- Protein - Uniprot proteins including protein features with variations and mapped datasets
- Chemistry - ChEMBL and HMDB datasets are supported for chemistry, disease and drug related analysis
- Taxonomy & Ontologies - Taxonomy, GO, EFO and ECO data with mapping to other datasets and child and parent query capability
- Your data - your custom data can be integrated with or without relation to other datasets
- Web UI - web interface for easy exploration and examples
- Web Services - REST or gRPC services
- R & Python - Bioconductor R and Python wrapper packages with built-in databases, for easier use from existing pipelines
Usage
First install the latest biobtree executable, available for Windows, Mac or Linux. Then extract the downloaded file to a new folder, open a terminal in that folder and start biobtree. Alternatively, the R and Python based wrapper packages biobtreeR and biobtreePy can be used instead of the executable for easier integration.
Starting biobtree with target datasets or genomes
# build ensembl genomes by tax id with uniprot&taxonomy datasets
biobtree --tax 595,984254 -d "uniprot,taxonomy" build
# build datasets only
biobtree -d "uniprot,taxonomy,hgnc" build
biobtree -d "hgnc,chembl,hmdb" build
# once the data is built, start the web server to use the web services and UI
biobtree web
# to see all options and datasets use help
biobtree help
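For pipelines that drive the executable from a script, the build commands above can be assembled programmatically. The following is a minimal sketch, not part of biobtree itself: the helper name `build_command` is hypothetical, and it only uses the `--tax` and `-d` options and the `build` command shown above. It assumes the `biobtree` executable is on the PATH.

```python
def build_command(datasets=None, tax_ids=None, executable="biobtree"):
    """Return the argument list for a biobtree build (hypothetical helper)."""
    args = [executable]
    if tax_ids:
        # --tax takes a comma-separated list of taxonomy ids
        args += ["--tax", ",".join(str(t) for t in tax_ids)]
    if datasets:
        # -d takes a comma-separated list of dataset names
        args += ["-d", ",".join(datasets)]
    args.append("build")
    return args


# Equivalent of: biobtree --tax 595,984254 -d "uniprot,taxonomy" build
cmd = build_command(datasets=["uniprot", "taxonomy"], tax_ids=[595, 984254])
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to start the build from Python.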
Starting biobtree with built-in databases
# 4 built-in databases are provided with commonly studied datasets and organism genomes to speed up the database build process
# Check the following function documentation for the content of each database
# https://github.com/tamerh/biobtreeR/blob/master/R/buildData.R
biobtree --pre-built 1 install
biobtree web
Built-in databases are updated regularly, at least for each Ensembl release, and all built-in database files along with configuration files are hosted in a separate GitHub repository.
Web service endpoints
# Meta
# dataset meta information
localhost:8888/ws/meta
# Search
# i is the only mandatory parameter
localhost:8888/ws/?i={terms}&s={dataset}&p={page}&f={filter}
# Mapping
# i and m are mandatory parameters
localhost:8888/ws/map/?i={terms}&m={mapfilter_query}&s={dataset}&p={page}
# Retrieve dataset entry. Both parameters are mandatory
localhost:8888/ws/entry/?i={identifier}&s={dataset}
# Retrieve entry with filtered mapping entries. Only the page parameter is optional
localhost:8888/ws/filter/?i={identifier}&s={dataset}&f={filter_datasets}&p={page}
# Retrieve entry results with page index. All the parameters are mandatory
localhost:8888/ws/page/?i={identifier}&s={dataset}&p={page}&t={total}
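As an illustration of how the mandatory and optional parameters fit together, the endpoint URLs above can be built with a few small helpers. This is a sketch under stated assumptions: the function names are hypothetical, the server is assumed to be reachable over plain HTTP at `localhost:8888` as in the examples, and the identifiers used in the usage lines are sample values only. The mapping query passed as `m` is left as a placeholder because its syntax is not covered in this section.

```python
from urllib.parse import urlencode

# Assumption: default host and port from the endpoint examples, plain HTTP
BASE_URL = "http://localhost:8888/ws"


def _url(path, params):
    # Drop parameters that were not supplied, then URL-encode the rest
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/{path}?{query}"


def search_url(terms, dataset=None, page=None, filter_=None):
    """Search: only i is mandatory."""
    return _url("", {"i": terms, "s": dataset, "p": page, "f": filter_})


def map_url(terms, map_query, dataset=None, page=None):
    """Mapping: i and m are mandatory."""
    return _url("map/", {"i": terms, "m": map_query, "s": dataset, "p": page})


def entry_url(identifier, dataset):
    """Entry retrieval: both parameters are mandatory."""
    return _url("entry/", {"i": identifier, "s": dataset})


# Sample values for illustration only
search = search_url("tpi1")
entry = entry_url("P04637", "uniprot")
```

The returned strings can be handed to any HTTP client, for example `urllib.request.urlopen`, once `biobtree web` is running.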
Publication
https://f1000research.com/articles/8-145
Building source
biobtree is written in Go for the data processing and Vue.js for the web application part. To build and create the biobtree executable, install go>=1.13 and run
go build
To build the web application for development, run the following in the web directory
npm install
npm run serve
To build the web package run
npm run build