BioDownloader
Adding support for other downloads and sources
Just off the top of my head...
PDB
- Validation reports
- Electron Density Maps
- Others (listed in the PDBe API 'files' endpoint, e.g. http://www.ebi.ac.uk/pdbe/api/pdb/entry/files/1cbs)
UniProt
- Sequence annotations in other formats (http://www.uniprot.org/help/api_retrieve_entries)
DSSP
- Pre-computed DSSP from ftp://ftp.cmbi.ru.nl/pub/molbio/data/dssp (more info at http://swift.cmbi.ru.nl/gv/dssp/)
Ensembl
- Variants and other sequence annotations (more info at http://www.ensembl.org/info/data/ftp/index.html)
CATH
- Loads of additional data (more info at http://www.cathdb.info/download)
Have you considered a more modular structure?
A fetcher sub-package with modules/objects for each file type? This would also allow different servers to be set, e.g. the RCSB PDB offers PDB file assemblies amongst other options. It would make this kind of expansion a lot easier, too.
@fsimkovic That would be much better going forward!
I guess we could have a generic fetcher class and then as many sub-classes as we want services/data sources, with proper abstraction and expandability.
I will start a new Project to help delineate what needs to be refactored... Feel free to jump in on this! Your ideas and changes are much appreciated!
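To make the generic fetcher idea above a bit more concrete, here is a rough sketch of what it could look like. Nothing here is existing BioDownloader code: the class names (`Fetcher`, `PDBeFilesFetcher`, `DSSPFetcher`), the `default_server` and `extension` attributes, and the method names are all hypothetical. The PDBe URL pattern and the DSSP FTP directory are taken from the links earlier in this thread; the `<id>.dssp` file naming is an assumption.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from urllib.request import urlopen


class Fetcher(ABC):
    """Hypothetical generic fetcher: shared download logic lives here."""

    default_server = ""  # each sub-class sets its own default
    extension = ""       # suffix used for the local file name

    def __init__(self, server=None):
        # Allow a different server (e.g. a mirror) to be set per instance.
        self.server = (server or self.default_server).rstrip("/")

    @abstractmethod
    def url(self, identifier):
        """Return the full URL for one identifier."""

    def filename(self, identifier):
        return f"{identifier}{self.extension}"

    def fetch(self, identifier, outdir="."):
        # Download the resource and write it to outdir; returns the path.
        target = Path(outdir) / self.filename(identifier)
        with urlopen(self.url(identifier)) as response:
            target.write_bytes(response.read())
        return target


class PDBeFilesFetcher(Fetcher):
    """List of files available for a PDB entry (PDBe API 'files' endpoint)."""

    default_server = "http://www.ebi.ac.uk/pdbe/api"
    extension = ".json"

    def url(self, identifier):
        return f"{self.server}/pdb/entry/files/{identifier}"


class DSSPFetcher(Fetcher):
    """Pre-computed DSSP files; the <id>.dssp naming is an assumption."""

    default_server = "ftp://ftp.cmbi.ru.nl/pub/molbio/data/dssp"
    extension = ".dssp"

    def url(self, identifier):
        return f"{self.server}/{identifier}{self.extension}"
```

With something like this, adding a new source would mostly mean writing a small sub-class with its own `url` method, and switching servers would just be `PDBeFilesFetcher(server=...)` with whatever mirror is wanted.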