add option to download default models
This PR adds an option when installing py-feat via pip to also download the default models.
pip install py-feat[default_models]
@ejolly Let's make sure this works before merging, as I haven't really tested it yet.
@ljchang Unfortunately this isn't going to work because you can only include package names in extras_require.
In fact pip doesn't seem to support any kind of post-install that isn't simply installing other packages due to security issues. That's also why they suggest including any package data within the package if you need it. I don't think that makes sense for us as our pip installs would be huge and would tie model weights to package versions.
I've added an alternative solution, which is a compromise, but still a little annoying:
- User runs pip install py-feat
- User runs the feat_get_models command in their terminal, which is set up automatically after they pip install
It's not that different from simply downloading the models on the first run of Detector, so I'm torn about whether it's worth adding. What do you think?
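For reference, here is a rough sketch of how a command like this could be wired up. It assumes a hypothetical module path (feat/get_models.py) and a placeholder model URL, neither of which is taken from the actual PR; only the console_scripts entry point mechanism is standard setuptools.

```python
# Hypothetical module, e.g. feat/get_models.py (path and names are illustrative).
import os
import urllib.request

# Placeholder mapping of file name -> URL; the real list would live in the package.
MODEL_URLS = {
    "example_model.pth": "https://example.com/models/example_model.pth",
}


def get_models(model_urls=MODEL_URLS, dest_dir="resources",
               fetch=urllib.request.urlretrieve):
    """Download every model file that is not already present in dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    downloaded = []
    for name, url in model_urls.items():
        target = os.path.join(dest_dir, name)
        if os.path.exists(target):
            continue  # already on disk, nothing to do
        fetch(url, target)
        downloaded.append(name)
    return downloaded


def main():
    """Entry point that pip would expose as the feat_get_models command."""
    for name in get_models():
        print(f"downloaded {name}")


# In setup.py the command would be registered roughly like this
# (assumed module path, other arguments omitted):
#
# setup(
#     entry_points={
#         "console_scripts": ["feat_get_models = feat.get_models:main"],
#     },
# )
```

Because get_models skips files that already exist, the same function could also be called lazily on the first run of Detector, which is why the two approaches end up so similar.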
@ejolly, I've only scratched the surface of my deep dive into Hugging Face repositories, but I definitely think this is the way to go. I'm going to keep adding notes here as I learn more.
- I've created an organization for the lab to host datasets or model repositories.
- models and datasets can be public or private and can solicit community feedback or block it.
- models and datasets can be versioned
- webhooks are possible. One thing I've wanted for a long time is to build a benchmarking server, which I think will be possible with Hugging Face. We can post our test data as private to Hugging Face (our EULAs prevent us from making it public). Every time a model is updated or a new one is added, a webhook can trigger our benchmarking tests on that model or on all of them. Honestly, I don't care if we have to pay for compute time on one of their Spaces; this would be amazing and would enable a living benchmark for py-feat.
- there is a Python CLI for working with repositories and model I/O.
- models can be standalone and downloaded, OR they can be integrated into a code repository. I think we would want to do this so you can download models from py-feat, just like you can from the transformers library.
- jupyter notebooks can be rendered and linked to colab. This could be nice for demos or tutorials
- We should do a deep dive into the possibility of porting py-feat to be an integrated library
- There are widgets to create live demos for each model. Not sure whether this will work for us.
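To make the versioned-download idea from the notes above concrete, here is a minimal sketch using hf_hub_download from the huggingface_hub package. The repository ID is a placeholder and fetch_model is a hypothetical helper, not something that exists in py-feat; the download parameter is there only so the function can be exercised without network access.

```python
def fetch_model(filename, repo_id="example-org/example-model",
                revision=None, download=None):
    """Return a local path to `filename` from a Hugging Face model repository.

    repo_id is a placeholder; revision can be a branch, tag, or commit hash,
    which is what would let model weights be versioned independently of the
    package release.
    """
    if download is None:
        # Imported lazily so this module loads even without huggingface_hub.
        from huggingface_hub import hf_hub_download as download
    return download(repo_id=repo_id, filename=filename, revision=revision)
```

Calling fetch_model("weights.pth", revision="v1.0") would then resolve to a cached local file if that tag exists in the repository; the file name and tag here are assumptions for illustration only.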
this is addressed in issue #221
Subsumed by #228