Lucien Thomas issues

Results: 3 issues of Lucien Thomas

fix kv cache issue with quantized_phi3 implementation

The current implementation of the quantized_phi3 model does not clear its kv cache between distinct prompts. This leads to errors when attempting to generate text sequentially with the same model...

SmolDocling model support

Would there be any interest in adding this model? https://huggingface.co/ds4sd/SmolDocling-256M-preview I toyed around with an implementation last night but most of my experience has been with text models and am...

I'm getting outrageously slow inference with both text-based and visual models

Describe the bug: Multiple minutes even with tiny models such as a 256M-param vision model (SmolVLM); it's not just the time loading the model into RAM, because if I...

bug