ValueError: large.bin has wrong file format!

Open eu9ene opened this issue 10 months ago • 0 comments

Downloading the FastText model fails quite often, especially with the "large" model.

A workaround is to pre-download the model with wget:

filters_dir="/builds/worker/.local/lib/python3.10/site-packages/opuscleaner/filters"
wget -O "${filters_dir}/large.bin" https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin

I think requests.get is not robust enough without retries, so it fails periodically, whereas wget does a lot more to ensure a reliable download.
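For illustration, here is a minimal sketch of what a retrying download could look like on the Python side. It is not OpusCleaner's code: the function name `download_with_retries` and its parameters (`attempts`, `backoff`, `opener`, `sleep`) are hypothetical, and it uses the standard library's `urllib.request` instead of `requests` so that it has no dependencies. It assumes that a failed or truncated transfer should never leave a file at the final path, so it writes to a temporary `.part` file and only renames it once the byte count matches `Content-Length` (when the server sends one).

```python
import http.client
import time
import urllib.request
from pathlib import Path


def download_with_retries(url, dest, attempts=5, backoff=1.0,
                          opener=urllib.request.urlopen, sleep=time.sleep):
    """Download url to dest, retrying failed or truncated transfers.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    dest = Path(dest)
    tmp = dest.with_name(dest.name + ".part")  # partial data never lands at dest
    for attempt in range(1, attempts + 1):
        try:
            with opener(url) as response, open(tmp, "wb") as out:
                expected = response.headers.get("Content-Length")
                written = 0
                # Stream in 1 MiB chunks so a large model is not held in memory.
                while chunk := response.read(1 << 20):
                    out.write(chunk)
                    written += len(chunk)
            # Treat a short read as a failure instead of keeping a corrupt file.
            if expected is not None and written != int(expected):
                raise OSError(f"truncated download: {written} of {expected} bytes")
            tmp.replace(dest)  # rename only after a complete download
            return dest
        except (OSError, http.client.HTTPException):
            tmp.unlink(missing_ok=True)
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(backoff * 2 ** (attempt - 1))  # exponential backoff
```

Called with the lid.176.bin URL above and `"${filters_dir}/large.bin"` as `dest`, this would play the same role as the wget workaround. The `opener` and `sleep` parameters exist only so the retry logic can be exercised without a network.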

eu9ene, Apr 17 '24 19:04