Results 31 biotrainer issues

Predictions for a secondary structure model ([dataset](https://github.com/J-SNACKKB/FLIP/tree/main/splits/secondary_structure)) should roughly match those from the [ProtTrans paper](https://ieeexplore.ieee.org/document/9477085). This could also be used to create a new test for the inferencer module with...

good first issue
testing

After migrating from [bio_embeddings](https://github.com/sacdallago/bio_embeddings) to calculate embeddings directly in biotrainer for the provided sequences, it is now theoretically possible to allow for fine-tuning existing protein language models (pLMs) such as...

enhancement
refactoring

For the protein-protein interaction mode, single-value (zero-dimensional) embeddings can't be concatenated by `torch.concat`. A reshape such as `embedding1.reshape(1)` would be necessary.

bug
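
A minimal sketch of the concatenation problem described above, assuming the two interaction partners each yield a zero-dimensional tensor. The values and the names `embedding2`, `concat_failed`, and `combined` are illustrative and not taken from the biotrainer code base.

```python
import torch

# Two hypothetical single-value embeddings (zero-dimensional tensors).
embedding1 = torch.tensor(0.25)
embedding2 = torch.tensor(0.75)

# Concatenating zero-dimensional tensors directly raises a RuntimeError.
try:
    torch.concat([embedding1, embedding2])
    concat_failed = False
except RuntimeError:
    concat_failed = True

# Reshaping each value to a one-element vector makes concatenation possible.
combined = torch.concat([embedding1.reshape(1), embedding2.reshape(1)])
print(concat_failed, combined.shape)
```

The reshape only adds a dimension of size one, so the values themselves are unchanged.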

It would be nice to have a tutorial on how to use custom embedders with biotrainer. This way, new protein language models can be used directly in biotrainer without having to...

documentation

The PPI mode is not yet compatible with all protocols. `sequence_to_class` has been tested thoroughly. Other per-sequence protocols should work as well. However, for per-residue tasks (`residue_to_class`), changes...

enhancement

This is a very worthwhile effort. Are you considering adding the BERT transformer encoder model and the associated masked language modeling task for pre-training? The task is actually the same...

Once the cross_validation PR is merged, parameter search for nested cross validation will be enabled. It would be nice to extend this behaviour to hold_out cross validation as well. A...

enhancement
good first issue

As a researcher, it would be nice to have an automatic random baseline as a comparison for every run. This could be included in the final test metrics: `test set...

enhancement
good first issue
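
One possible way to compute such a random baseline, sketched in plain Python. The function name `random_baseline_accuracy`, its parameters, and the choice of accuracy as the metric are assumptions for illustration; the issue does not specify how biotrainer would implement or report it.

```python
import random


def random_baseline_accuracy(test_labels, n_repeats=100, seed=42):
    """Average accuracy of predictions drawn at random from the test labels.

    Hypothetical helper: each "prediction" is sampled from the empirical
    label distribution of the test set, so class imbalance is reflected.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_repeats):
        predictions = [rng.choice(test_labels) for _ in test_labels]
        correct = sum(p == t for p, t in zip(predictions, test_labels))
        accuracies.append(correct / len(test_labels))
    return sum(accuracies) / n_repeats


# Example with a balanced two-class test set.
labels = ["membrane", "soluble"] * 50
baseline = random_baseline_accuracy(labels)
print(f"random baseline accuracy: {baseline:.3f}")
```

Averaging over several repeats with a fixed seed keeps the reported baseline stable between runs.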

The LightAttention model used for the residues_to_class protocol uses BatchNorm1D. However, using a batch size of 1 is not possible with BatchNorm1D. Because a batch size of 1 is an...

bug
wontfix
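
To illustrate why a batch of one is problematic for batch normalization, here is a plain Python sketch of the training-time statistic on a (batch, features) input. It is not the LightAttention code, and the helper `batch_norm` is hypothetical: with a single sample, the batch mean equals the sample and the variance is zero, so every normalized feature collapses to zero.

```python
import math


def batch_norm(batch, eps=1e-5):
    """Normalize each feature with the mean and biased variance of the batch."""
    n = len(batch)
    n_features = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(n_features)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n
        for j in range(n_features)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(n_features)]
        for row in batch
    ]


# A single sample carries no information after normalization.
single = batch_norm([[0.3, -1.2, 5.0]])
# With two samples, the features keep their relative differences.
pair = batch_norm([[0.3, -1.2, 5.0], [1.3, 0.8, 1.0]])
print(single)
print(pair)
```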

Currently, the config file is loaded first (but not yet fully sanity-checked; for example, biotrainer does not check whether the input files actually exist, so embeddings might be...

refactoring