MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki
A model component (e.g. the Swahili encoder) is likely to exist on multiple devices. Because each device samples its own task sequence, it is possible that when a gradient synchronization...
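A minimal sketch of what such a synchronization could look like, assuming one `torch.distributed` process group per shared module; the function name and the zero-filling of unused replicas are illustrative assumptions, not MAMMOTH's actual implementation:

```python
# Minimal sketch, not MAMMOTH's actual API: all-reduce the gradients of one
# shared module across only the ranks that hold a replica of it, using a
# dedicated torch.distributed process group.
import torch
import torch.distributed as dist

def sync_module_grads(module: torch.nn.Module, group) -> None:
    """Average gradients of `module` over the ranks in `group`.

    Because each device samples its own task sequence, a rank may not have
    used this module in the current step; its grads are then None and must
    be zero-filled so the collective call still lines up across ranks.
    """
    world_size = dist.get_world_size(group=group)
    for param in module.parameters():
        if param.grad is None:
            param.grad = torch.zeros_like(param)  # unused replica: send zeros
        dist.all_reduce(param.grad, op=dist.ReduceOp.SUM, group=group)
        param.grad /= world_size
```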
The existing translation server from OpenNMT-py was refurbished, and a demo frontend was implemented using Streamlit.
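A minimal sketch of what such a Streamlit frontend could look like; the server URL and the JSON payload shape follow OpenNMT-py's REST server conventions but are assumptions here, not the demo's actual code:

```python
# Minimal sketch of a Streamlit demo frontend; URL and payload are assumed.
import requests
import streamlit as st

SERVER_URL = "http://localhost:5000/translator/translate"  # hypothetical

st.title("MAMMOTH translation demo")
text = st.text_area("Text to translate")
model_id = st.number_input("Model id on the server", value=100, step=1)

if st.button("Translate") and text:
    payload = [{"src": text, "id": int(model_id)}]
    response = requests.post(SERVER_URL, json=payload, timeout=60)
    response.raise_for_status()
    st.json(response.json())
```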
Closes #63. Same idea as v2; didn't bother porting from it. Does not include bucket states, although this could maybe be done by picking the line indices from all...
Currently, the `--train_from` option provides no means of restoring corpora states, so training resumes from the beginning of the bitexts. This means that resumed models train on a subset...
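A hypothetical sketch of the line-index idea mentioned above: remember how many lines each corpus has yielded, save that alongside the model checkpoint, and skip that many lines when training resumes. The class and attribute names are illustrative, not part of MAMMOTH:

```python
# Hypothetical sketch: a corpus reader whose position can be checkpointed
# and restored, so --train_from could continue mid-bitext.
import itertools

class ResumableCorpus:
    def __init__(self, path: str, lines_consumed: int = 0):
        self.path = path
        self.lines_consumed = lines_consumed

    def __iter__(self):
        with open(self.path, encoding="utf-8") as fh:
            # Skip the prefix that was already consumed before checkpointing.
            for line in itertools.islice(fh, self.lines_consumed, None):
                self.lines_consumed += 1
                yield line.rstrip("\n")

    def state_dict(self) -> dict:
        return {"path": self.path, "lines_consumed": self.lines_consumed}

# At checkpoint time: persist corpus.state_dict() next to the model weights.
# Under --train_from: rebuild with ResumableCorpus(**saved_state) so training
# continues from the first unseen line rather than the start of the bitext.
```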
Hi, is it possible to use Mammoth for other seq2seq problems, such as multilingual video/image captioning? What I have in mind is to prepare video features in this format (batch,...
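The feature format in the question is truncated, but assuming precomputed features of shape `(batch, time, feature_dim)`, a generic sketch of feeding them to a transformer encoder in place of token embeddings might look like this; it is plain PyTorch, not MAMMOTH's input pipeline:

```python
# Generic PyTorch sketch: precomputed video features (assumed shape
# (batch, time, feature_dim)) are projected and encoded in place of
# token embeddings.
import torch
import torch.nn as nn

d_model = 512
project = nn.Linear(2048, d_model)  # map feature_dim -> model dim
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
    num_layers=6,
)

features = torch.randn(4, 32, 2048)   # (batch, time, feature_dim), dummy data
memory = encoder(project(features))   # (batch, time, d_model)
# `memory` would then condition a text decoder through cross-attention.
```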
closes #60
Going through the existing catalogue of options listed in our docs, a number of them seem not to be plugged in. The list below is most likely not exhaustive. ###...
Currently, we only support training encoder-decoder models. We might want to support encoder-only models (e.g. BERT) and decoder-only models (e.g. GPTs). This could be inferred automatically from the types of sharing...
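One way this inference could look, as a hedged sketch with hypothetical config keys (the actual sharing options may be named differently):

```python
# Hedged sketch with hypothetical config keys: infer the architecture from
# which sharing groups the config actually defines.
def infer_architecture(config: dict) -> str:
    has_enc = bool(config.get("enc_sharing_groups"))
    has_dec = bool(config.get("dec_sharing_groups"))
    if has_enc and has_dec:
        return "encoder-decoder"
    if has_enc:
        return "encoder-only"   # BERT-style
    if has_dec:
        return "decoder-only"   # GPT-style
    raise ValueError("config defines neither encoder nor decoder modules")
```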
Add a feature to learn virtual embeddings for prompt/prefix learning on a pretrained model. This would depend on #24 being implemented first.
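A generic PyTorch sketch of what such virtual embeddings could look like: trainable prefix vectors prepended to frozen pretrained token embeddings. This is illustrative, not a MAMMOTH interface:

```python
# Sketch of virtual embeddings for prefix learning: trainable prefix vectors
# are prepended to frozen pretrained token embeddings.
import torch
import torch.nn as nn

class PrefixEmbedding(nn.Module):
    def __init__(self, token_embedding: nn.Embedding, prefix_len: int):
        super().__init__()
        self.token_embedding = token_embedding
        for p in self.token_embedding.parameters():
            p.requires_grad = False  # the pretrained embeddings stay frozen
        dim = token_embedding.embedding_dim
        self.prefix = nn.Parameter(torch.randn(prefix_len, dim) * 0.02)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) -> (batch, prefix_len + seq_len, dim)
        embedded = self.token_embedding(token_ids)
        prefix = self.prefix.unsqueeze(0).expand(token_ids.size(0), -1, -1)
        return torch.cat([prefix, embedded], dim=1)
```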
Currently, we rely on custom-made layer / encoder definitions for our modules. Cf. for instance this class:
https://github.com/Helsinki-NLP/mammoth/blob/c6a193b1cc16bf7140520c44712bcf82701ec87d/mammoth/modules/transformer_encoder.py#L13
This entails that any architectural variant we wish to test has to...
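A hedged sketch of a registry/builder pattern that would let architectural variants be selected from config instead of requiring edits to hardcoded classes; all names here are made up for illustration:

```python
# Sketch of a registry/builder pattern for pluggable encoder layers.
ENCODER_LAYER_REGISTRY: dict = {}

def register_encoder_layer(name: str):
    def decorator(cls):
        ENCODER_LAYER_REGISTRY[name] = cls
        return cls
    return decorator

@register_encoder_layer("vanilla")
class VanillaEncoderLayer:
    def __init__(self, d_model: int, heads: int):
        self.d_model, self.heads = d_model, heads

def build_encoder_layer(config: dict):
    # A new variant only needs a @register_encoder_layer decorator to become
    # selectable via config["layer_type"], with no edits to the builder.
    cls = ENCODER_LAYER_REGISTRY[config["layer_type"]]
    return cls(d_model=config["d_model"], heads=config["heads"])
```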