foldcomp
Compressing protein structures effectively with torsion angles
Hi, just came across this repo, very nice project. I was thinking about the feasibility of supporting multi-chain structures, an option could maybe be storing the offset coordinates for each...
Hello! I've tried using `highquality_clust30` as a reference and identified the following issue. The database has around 200k repeated entries; they appear to be fragmented proteins split at the `X` amino acid....
https://github.com/steineggerlab/foldcomp/blob/b97c193e3029d861b6ca6b7c2970b562b779a4de/foldcomp/foldcomp.cxx#L333-L434 # Current behaviour Currently, there is no way to programmatically capture missing IDs via the Python interface. The missing IDs are printed to a non-capturable `stderr`. This would be a useful...
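A generic workaround for messages that native code writes straight to `stderr` is to redirect file descriptor 2 at the OS level while the call runs. The sketch below is not a foldcomp feature; `capture_fd_stderr` is a hypothetical helper name, and it assumes the messages go to file descriptor 2 of the current process.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def capture_fd_stderr():
    """Temporarily redirect file descriptor 2 into a temp file.

    Yields a dict whose "text" entry is filled in when the block exits.
    """
    captured = {"text": ""}
    sys.stderr.flush()            # flush Python-level buffers first
    saved_fd = os.dup(2)          # keep a copy of the real stderr
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), 2)  # fd 2 now points at the temp file
        try:
            yield captured
        finally:
            sys.stderr.flush()
            os.dup2(saved_fd, 2)  # restore the real stderr
            os.close(saved_fd)
            tmp.seek(0)
            captured["text"] = tmp.read().decode(errors="replace")
```

Any call that prints to `stderr` from compiled code can be wrapped in `with capture_fd_stderr() as cap:`, and `cap["text"]` can be parsed afterwards. Native code that buffers its output may need to flush before the block exits for the text to appear.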
Hi! I used "pip install foldcomp" to install it on my system (WSL2 with conda, as well as Google Colab). However, it looks like foldcomp is not recognized because when...
Installing from PyPI with pip doesn't work under Python 3.12, but pip install with a local copy of the git repo does. Error on pip install from PyPI: [error.txt](https://github.com/steineggerlab/foldcomp/files/14773102/error.txt)...
When I extract `FASTA` from `highquality_clust30` I receive the following headers. ``` >ESMFOLD V0 PREDICTION FOR MGYP000138429313 >ESMFOLD V0 PREDICTION FOR MGYP001595280761 ... ``` I use `FoldComp` for a downstream...
I.e., when I extract FASTA from `afdb_swissprot_v4`: ``` foldcomp_id = AF-B1YUJ2-F1-model_v4 fasta_header = AF-B1YUJ2-F1-model_v4.pdb ``` When I extract FASTA from my personal db: ``` foldcomp_id = MIP_00183643.pdb fasta_header = MIP_00183643.pdb...
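The header variants quoted in the two reports above can be reduced to a bare identifier with a small helper. This is a sketch only: `normalize_header` is a hypothetical name, and it assumes the identifier is always the last whitespace-separated token, optionally followed by a `.pdb` suffix.

```python
def normalize_header(header: str) -> str:
    """Reduce a FASTA header to a bare entry identifier.

    Assumption: the identifier is the last whitespace-separated token,
    possibly carrying a trailing ".pdb" extension.
    """
    token = header.lstrip(">").strip().split()[-1]
    if token.endswith(".pdb"):
        token = token[: -len(".pdb")]
    return token
```

Applied to the headers quoted above, this gives `MGYP000138429313`, `AF-B1YUJ2-F1-model_v4`, and `MIP_00183643` respectively.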
Hi! When using foldcomp compress on this file: https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/proteomics/pdb/1tim.pdb it breaks into 4 parts: 1tim.pdbA_0.fcz, 1tim.pdbA_1.fcz, 1tim.pdbB_0.fcz and 1tim.pdbB_1.fcz. Is this behaviour intended? How to decompress into one pdb file...
Probably related to #39. I used the command from the issue to subset a database: ``` f"{foldcomp_binary} decompress -t 64 --db --id-list {txt_representatives} {full_af_db} {output_foldcomp}" ``` Now when I run...
I have a lot of data to compress, and they are stored in nested subdirectories (e.g. /Data/Protein/Mutation/...pdb). Default behavior of "foldcomp compress -r" seems to be to create an output...
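One way to keep a nested layout is to compute a mirrored output path for every input file before compressing. The sketch below only plans the paths and does not call foldcomp; `mirrored_outputs` is a hypothetical helper, and the `.pdb` and `.fcz` suffixes are assumptions taken from the file names mentioned in these reports.

```python
from pathlib import Path


def mirrored_outputs(input_root, output_root, in_suffix=".pdb", out_suffix=".fcz"):
    """Pair each input file with an output path that mirrors its subdirectory."""
    input_root = Path(input_root)
    output_root = Path(output_root)
    pairs = []
    # rglob walks all nested subdirectories below input_root
    for src in sorted(input_root.rglob(f"*{in_suffix}")):
        rel = src.relative_to(input_root)
        pairs.append((src, (output_root / rel).with_suffix(out_suffix)))
    return pairs
```

Each pair can then be handed to whichever compression call is in use, after creating `dst.parent` with `mkdir(parents=True, exist_ok=True)`.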