PeptDeep-HLA
DL model to predict HLA peptide presentation
A deep learning model that predicts whether a peptide is presented by HLA molecules.
This is a sub-package of AlphaPeptDeep; see our publication for details.
Quick start
Use Colab to train the models and predict HLA peptides; see:
- Training from scratch: nbs/HLA1_Classifier.ipynb
- Transfer learning: nbs/HLA1_transfer.ipynb
Installation
After installing Anaconda, clone and install this package using the commands below:
cd path/to/place/this/package
git clone https://github.com/MannLabs/PeptDeep-HLA.git
cd PeptDeep-HLA
pip install .
Or install directly via pip:
pip install git+https://github.com/MannLabs/PeptDeep-HLA
CLI
After installation, we can use the command line interface (CLI) to train sample-specific HLA models and predict HLA peptides, either from fasta files or from peptide tables. Typing the command below will show the usage message.
peptdeep_hla class1 -h
Here are the details of the CLI parameters/options:
- --prediction_save_as TEXT: File to save the predicted HLA peptides. [required]
- --fasta TEXT: The input fasta files for training and prediction. Multiple fasta files are supported, such as: --fasta 1.fasta --fasta 2.fasta .... If --peptide_file_to_predict is provided, these fasta files will be ignored in prediction.
- --peptide_file_to_predict TEXT: Peptide file for prediction. It is a txt/tsv/csv file which contains the peptide sequences to be predicted in the sequence column. If not provided, the program will predict peptides from the fasta files. Multiple files are supported. Optional, default is empty.
- --pretrained_model TEXT: The input model for transfer learning or prediction. Optional, default is the built-in pretrained model.
- --prob_threshold FLOAT: Predicted probability threshold to discriminate HLA peptides. Optional, default=0.7.
- --peptide_file_to_train TEXT: Peptide file for transfer learning. It is a txt/tsv/csv file which contains true HLA peptide sequences for training in the sequence column. Multiple files are supported. Optional, default is empty.
- --model_save_as TEXT: File to save the transfer-learned model. Optional, applicable if --peptide_file_to_train is provided.
- --predicting_batch_size INTEGER: Batch size for prediction. Larger values are better, but the limit depends on the available GPU/CPU RAM. Optional, default=4096.
- --training_batch_size INTEGER: Optional, default=1024.
- --training_epoch INTEGER: Optional, default=40.
- --training_warmup_epoch INTEGER: Optional, default=10.
- --min_peptide_length INTEGER: Optional, default=8.
- --max_peptide_length INTEGER: Optional, default=14.
- -h, --help: Show this message and exit.
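To make the length and probability options concrete, here is a minimal Python sketch of one plausible reading of them. It is an illustration, not the code used inside peptdeep_hla: it assumes that candidates from a fasta entry are all subsequences whose length lies between --min_peptide_length and --max_peptide_length, and that a peptide is kept when its predicted probability is at least --prob_threshold. The function names candidate_peptides and keep_confident are hypothetical, and whether the real comparison is inclusive is also an assumption.

```python
def candidate_peptides(protein, min_len=8, max_len=14):
    """Enumerate unique subsequences of `protein` with min_len <= length <= max_len.

    Assumption: this mirrors how the length options bound the search space;
    the actual tool may generate candidates differently.
    """
    seen = set()
    for length in range(min_len, max_len + 1):
        for start in range(len(protein) - length + 1):
            seen.add(protein[start:start + length])
    return sorted(seen)


def keep_confident(scored_peptides, prob_threshold=0.7):
    """Keep peptides whose predicted probability reaches the threshold.

    `scored_peptides` is an iterable of (sequence, probability) pairs.
    Assumption: the comparison is inclusive (>=).
    """
    return [seq for seq, prob in scored_peptides if prob >= prob_threshold]
```

With the defaults, a protein of 20 residues yields at most 13 + 12 + 11 + 10 + 9 + 8 + 7 = 70 candidates, which shows why the number of peptides to score grows quickly with fasta size and why a large --predicting_batch_size helps.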
For example, use the following command to predict from fasta without transfer learning:
peptdeep_hla class1 --fasta /Users/zengwenfeng/Workspace/Data/fasta/irtfusion.fasta --prediction_save_as /Users/zengwenfeng/Workspace/Data/fasta/irt_hla.tsv
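A transfer-learning run combines the same options with --peptide_file_to_train and --model_save_as. The Python sketch below prepares a peptide table with a sequence column and assembles the corresponding command line from the options documented above. The file names, the .pt extension for the saved model, and the two helper functions are hypothetical, and the two peptide sequences are placeholders only; replace them with peptides identified in your own sample. The sketch prints the command instead of running it.

```python
import csv
import shlex
import tempfile
from pathlib import Path


def write_peptide_table(path, peptides):
    """Write a tab-separated file with a single `sequence` column."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["sequence"])
        for peptide in peptides:
            writer.writerow([peptide])


def build_transfer_command(fasta_files, train_table, model_out, prediction_out):
    """Assemble a `peptdeep_hla class1` call using only documented options."""
    command = ["peptdeep_hla", "class1"]
    for fasta in fasta_files:
        # --fasta may be repeated, once per file
        command += ["--fasta", str(fasta)]
    command += [
        "--peptide_file_to_train", str(train_table),
        "--model_save_as", str(model_out),
        "--prediction_save_as", str(prediction_out),
    ]
    return command


# Hypothetical working directory and file names, for illustration only.
workdir = Path(tempfile.mkdtemp())
train_table = workdir / "sample_hla_peptides.tsv"
write_peptide_table(train_table, ["GILGFVFTL", "NLVPMVATV"])  # placeholder peptides

command = build_transfer_command(
    ["1.fasta", "2.fasta"],
    train_table,
    workdir / "sample_model.pt",
    workdir / "sample_predictions.tsv",
)
print(shlex.join(command))  # copy the printed line into a terminal to run it
```

Building the argument list first and printing it with shlex.join keeps paths with spaces correctly quoted; the same list could be passed to subprocess.run once the package is installed.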
Notebook
Using Jupyter notebooks might be easier for users who are not familiar with the CLI.
HLA1_Classifier.ipynb. We used this notebook to train the pretrained models:
- HLA1_IEDB.pt: the LSTM model trained with HLA1 sequences from IEDB. This is the default pretrained model in peptdeep_hla.
- HLA1_94.pt: the LSTM model trained with 94 allele types.
HLA1_transfer.ipynb. A simple example of transfer learning to train the sample-specific model.
Spectral libraries
After HLA peptides are predicted, we can then use these peptides to predict spectral libraries with AlphaPeptDeep for HLA DIA analysis.
Citations
Wen-Feng Zeng, Xie-Xuan Zhou, Sander Willems, Constantin Ammar, Maria Wahle, Isabell Bludau, Eugenia Voytik, Maximilian T. Strauss & Matthias Mann. AlphaPeptDeep: a modular deep learning framework to predict peptide properties for proteomics. Nat Commun 13, 7238 (2022). https://doi.org/10.1038/s41467-022-34904-3