text-generation-inference
                                
Display available cached versions in TGI server error message of Neuron backend
Pulling from https://github.com/huggingface/optimum-neuron/pull/776
If a model is cached with a different configuration, I want to display alternative options to the user.
If someone copies the deploy code from Hugging Face and changes something (e.g. the sequence length), it is not obvious from that code why it no longer works, especially if they don't understand that the model has to be compiled because they are referencing the original model.
Based on a true story!
I also added some carriage returns to make the error message more readable.
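To illustrate the idea, here is a minimal sketch of how such a message could be assembled. It is not the code from this PR: the function name `format_cached_alternatives`, the wording of the message, and the entry keys (`batch_size`, `sequence_length`, `num_cores`) are assumptions chosen for the example, and the entries are plain dictionaries standing in for whatever the cache lookup returns.

```python
from typing import Any, Dict, List


def format_cached_alternatives(model_id: str, entries: List[Dict[str, Any]]) -> str:
    """Build a multi-line error message listing cached configurations.

    Hypothetical helper: the name, wording, and entry keys are assumptions.
    """
    lines = [f"No cached version found for {model_id} with the requested configuration."]
    if not entries:
        lines.append("No cached configurations are available for this model.")
        return "\n".join(lines)
    lines.append("Cached configurations available for this model:")
    for entry in entries:
        # One configuration per line, keys sorted so the output is stable.
        settings = ", ".join(f"{key}={entry[key]}" for key in sorted(entry))
        lines.append(f"- {settings}")
    return "\n".join(lines)


# Made-up entries, for illustration only.
example_entries = [
    {"batch_size": 1, "sequence_length": 4096, "num_cores": 2},
    {"batch_size": 4, "sequence_length": 4096, "num_cores": 8},
]
message = format_cached_alternatives("my-org/my-model", example_entries)
print(message)
```

Putting each configuration on its own line is what makes the alternatives easy to scan in a server log.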
get_hub_cached_entries raises an error if it is given a model whose config has no model_type. For example, with the randomly selected model_id = "hexgrad/Kokoro-82M":
    Traceback (most recent call last):
      File "", line 1, in
      File "/opt/aws_neuronx_venv_pytorch_2_1/lib/python3.10/site-packages/optimum/neuron/utils/hub_cache_utils.py", line 431, in get_hub_cached_entries
        model_type = target_entry.config["model_type"]
    KeyError: 'model_type'
However, we already call that function inside of is_cached at the top of this block, so I don't know if we are filtering for certain types of models before we get to this point or not. If not, the existing code would generate that error before it ever gets here.
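If it turns out that no filtering happens upstream, one possible guard is sketched below. This is only an assumption about how the lookup could be wrapped, not something this PR does: `cached_entries_or_empty` and `fake_lookup` are hypothetical names, and `fake_lookup` is a stand-in that mimics the failure above rather than the real get_hub_cached_entries.

```python
from typing import Any, Callable, Dict, List

Entries = List[Dict[str, Any]]


def cached_entries_or_empty(lookup: Callable[[str], Entries], model_id: str) -> Entries:
    """Return cached entries, or an empty list if the lookup raises KeyError.

    Hypothetical wrapper: `lookup` stands in for the real cache lookup.
    """
    try:
        return lookup(model_id)
    except KeyError:
        # A config without "model_type" cannot match any cached entry.
        return []


def fake_lookup(model_id: str) -> Entries:
    """Stand-in lookup that fails like the traceback above for one model."""
    configs = {
        "with-type": {"model_type": "llama"},
        "no-type": {},  # config lacking "model_type"
    }
    model_type = configs[model_id]["model_type"]  # KeyError for "no-type"
    return [{"model_type": model_type, "sequence_length": 4096}]


print(cached_entries_or_empty(fake_lookup, "with-type"))
print(cached_entries_or_empty(fake_lookup, "no-type"))
```

Catching KeyError this broadly could hide unrelated bugs inside the lookup, so checking for the missing key explicitly may be the better choice in real code.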
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.