huggingface-vscode-endpoint-server
Refactor generators and add ct2fast support
Hello, in my fork I have:
- refactored the generators
- added support for loading ctranslate2-based models (starcoderct2fast), which are incredibly fast on consumer hardware
- added support for detecting the model type and returning the correct class in the main function (from a local path or the HF hub)
- added support for websocket streaming
It is a WIP, but if you want any of these features I'll be happy to create a proper PR.
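As a rough illustration of the model-type detection item in the list above, here is a minimal sketch. It is not the fork's actual code: the class names, `select_generator`, and the rule that a ct2fast model is recognised by "ct2fast" in its name are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class GeneratorBase:
    """Hypothetical common interface for all generators."""
    model_name: str

    def backend(self) -> str:
        raise NotImplementedError


class TransformersGenerator(GeneratorBase):
    """Placeholder for a generator backed by regular transformers models."""

    def backend(self) -> str:
        return "transformers"


class Ct2FastGenerator(GeneratorBase):
    """Placeholder for a generator backed by ctranslate2 models."""

    def backend(self) -> str:
        return "ctranslate2"


def select_generator(model_name: str) -> GeneratorBase:
    # Assumption: ct2fast conversions carry "ct2fast" in the model name
    # or local directory name; a real check might inspect the model files.
    if "ct2fast" in model_name.lower():
        return Ct2FastGenerator(model_name)
    return TransformersGenerator(model_name)
```

With this shape, the main function only calls `select_generator(name)` and never needs to know which backend it received.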
Sorry for taking so long to reply to you. In fact, I encourage everyone to submit pull requests directly, and I will carefully review each one.
Has anything happened with this ticket?