
Control where the model is downloaded to?

Open awhillas opened this issue 2 years ago • 3 comments

Hi, this is more of a minor feature request. I'm trying to use NNSplit in a container that has a read-only file system except for the /tmp dir. It would be groovy if one could provide a local path to load the model from / download it to. Perhaps this is in the Python interface already, but I couldn't see it.

I know you can specify a path when calling NNSplit(), but this gets more complicated as I'm including it in a module that then gets included in another project.

Anyway, nice work and thanks!

awhillas avatar Dec 23 '21 01:12 awhillas

Hi, sorry for being late here.

I am not sure I understand your request correctly. Are you asking for a way to customize the cache directory (currently always ~/.cache/nnsplit)? If not, please elaborate (maybe with an example).

bminixhofer avatar Jan 18 '22 13:01 bminixhofer

@bminixhofer Seems like I've hit this case. I was trying to call NNSplit.load("en") on a server under a user with very limited access; the home dir of that user was not writable, so I was hitting permission error 13.

Being able to optionally specify a cache directory when loading would be great; then I could point NNSplit to e.g. /tmp/ as @awhillas noted. Something like NNSplit.load("en", cache_directory="/tmp")
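In the meantime, a possible workaround (a sketch, assuming the cache really does live under ~/.cache/nnsplit as mentioned above — the `cache_directory` parameter itself does not exist yet) is to redirect the home directory to a writable location before loading:

```python
import os

# Workaround sketch: NNSplit caches under ~/.cache/nnsplit, so pointing
# HOME at a writable directory makes the cache resolve to
# /tmp/.cache/nnsplit on an otherwise read-only filesystem.
os.environ["HOME"] = "/tmp"

# ~ now expands to /tmp, so the cache path becomes writable:
cache_path = os.path.expanduser("~/.cache/nnsplit")

# from nnsplit import NNSplit            # then load as usual
# splitter = NNSplit.load("en")          # downloads land under /tmp
```

This only works if nothing else in the process depends on the original HOME, so a first-class `cache_directory` argument would still be the cleaner fix.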

synweap15 avatar Mar 10 '22 13:03 synweap15

@bminixhofer Yes, that is exactly what I'm suggesting, preferably via an environment variable to make it Dockerfile-friendly.

awhillas avatar Sep 05 '22 22:09 awhillas

Hi! Sorry for being so quiet on this library. I have been working on a major revamp, expanding support to 85 languages, switching to a new training objective without labelled data, and switching the backbone to a BERT-style model.

The models are now loaded via the Hugging Face Hub; see here for how to control the cache directory: https://huggingface.co/docs/transformers/installation?highlight=transformers_cache#cache-setup
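Concretely, that means the Dockerfile-friendly request above is covered by the Hub's standard cache environment variables. A minimal sketch (the `HF_HOME` variable is the one documented in the linked cache-setup page; the `/tmp/hf` path is just an example):

```python
import os

# Redirect the Hugging Face cache to a writable directory before any
# model is loaded. In a Dockerfile this would be `ENV HF_HOME=/tmp/hf`.
os.environ["HF_HOME"] = "/tmp/hf"

# import wtpsplit   # models loaded after this point download under /tmp/hf
```

Setting the variable in the environment (rather than in code) has the advantage that no application changes are needed, which is exactly the Docker use case @awhillas described.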

bminixhofer avatar May 31 '23 11:05 bminixhofer