masakhane-web

Update endpoint

Open Kabongosalomon opened this issue 2 years ago • 2 comments

Description

I'm seeing this error when adding an additional model:

image

It might be related to the time it takes to download the model.

Kabongosalomon avatar Jan 20 '23 02:01 Kabongosalomon

@Zenthon @vukosim Ishe has finished the Sentence Alignment on the gov-za multilingual, so he'll jump ahead of me and work on Masakhane. However, I do have an idea for this Masakhane bug, so I want to document it here in case Ishe wants to try it out.

We suspect that the program runs out of memory when trying to add multiple models, so perhaps we can try dealing with only one model at a time.

You can use python manage.py add_language to add language references to the database and download the corresponding models. Perhaps the /update endpoint should only download the models that have references in the database.

Note: when you run python manage.py remove_language, you should also delete the directory storing the model, so that the database and the files on disk stay consistent.
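A minimal sketch of that cleanup step, assuming each model lives in its own subdirectory under a single models folder. The function name remove_model_dir and the directory layout are hypothetical and not taken from the masakhane-web code:

```python
import shutil
from pathlib import Path


def remove_model_dir(models_root: Path, model_name: str) -> bool:
    """Delete the directory holding one model's files.

    Returns True if a directory was deleted, False if there was nothing to delete.
    The layout (one subdirectory per model under models_root) is an assumption.
    """
    root = models_root.resolve()
    target = (root / model_name).resolve()

    # Refuse to touch anything that is not strictly inside the models folder,
    # e.g. a name like "../something" or an empty name pointing at the root.
    if root not in target.parents:
        raise ValueError(f"{model_name!r} is not inside {root}")

    if not target.is_dir():
        return False

    shutil.rmtree(target)
    return True
```

Something like this could be called from the remove_language command right after the database row is removed, so the two never drift apart.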

Once the models are downloaded, the client should list which models are available for translation. When the user picks a model, say eng-swa, only that model should be loaded into memory. If the user then picks another model, say eng-tiv, the old model should be dropped and the new one loaded in its place.
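Roughly, the "one model in memory" idea could look like the sketch below. The class name SingleModelSlot and the loader callback are hypothetical; the real loading code would be whatever the app already uses to load a model from disk:

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

M = TypeVar("M")


class SingleModelSlot(Generic[M]):
    """Holds at most one loaded model; picking another one replaces it."""

    def __init__(self, loader: Callable[[str], M]) -> None:
        self._loader = loader  # hypothetical: function that loads a model by name
        self._lock = threading.Lock()  # requests may arrive concurrently
        self._name: Optional[str] = None
        self._model: Optional[M] = None

    @property
    def loaded_name(self) -> Optional[str]:
        return self._name

    def get(self, name: str) -> M:
        with self._lock:
            if name != self._name:
                # Drop our reference to the old model before loading the new one,
                # so both do not have to fit in memory at the same time.
                self._model = None
                self._name = None
                self._model = self._loader(name)
                self._name = name
            return self._model
```

Repeated requests for the same model reuse the loaded instance, so the slow load only happens when the user switches language pairs.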

My only concern with this approach is that loading a model takes a long time, which might affect UX.

lastrucci01 avatar Jan 24 '23 08:01 lastrucci01

@Kabongosalomon It seems that the config files of some language models are incorrect (their checkpoint file parameters are wrong or missing), which is why you can't add some language models, such as the en-ln- model.
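A quick check along these lines could flag such configs before a model is added. This is only a sketch: the function name find_checkpoint_problems is hypothetical, and the config is treated as a plain dict whose keys name files inside the model directory. Which keys to check is an assumption the caller has to supply, since the real config layout is not shown here:

```python
from pathlib import Path
from typing import Dict, List


def find_checkpoint_problems(
    config: Dict[str, str], model_dir: Path, required_keys: List[str]
) -> List[str]:
    """Return a human-readable problem for each missing or dangling file entry."""
    problems = []
    for key in required_keys:
        value = config.get(key)
        if not value:
            # Key is absent or set to an empty value.
            problems.append(f"{key}: missing or empty")
        elif not (model_dir / value).is_file():
            # Key is set, but the file it names is not in the model directory.
            problems.append(f"{key}: file not found: {value}")
    return problems
```

An empty list means every checked entry points at a file that exists; anything else can be reported back instead of failing partway through the add.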

However, I have tested that you can add multiple languages at once. Example:

  1. en-sw-JW300 (Swahili)
  2. en-tiv- (Tiv)
  3. en-iso- (Isoko)

All three of those models were loaded at the same time without issues.

idzingirai avatar Jan 24 '23 16:01 idzingirai