huggingface-vscode-endpoint-server
Refactor generators and add ct2fast support
Hello, in my fork I:
- refactored the generators
- added support for loading CTranslate2-based models (starcoderct2fast), which are very fast on consumer hardware
- added detection of the model type, so the main function returns the correct generator class (for models loaded locally or from the HF Hub)
- added support for WebSocket streaming
It is still a WIP, but if you want any of these features I'll be happy to create a proper PR.
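
To give an idea of the model-type detection, here is a minimal sketch. It is not the code from the fork: the class names (`GeneratorBase`, `TransformersGenerator`, `CT2FastGenerator`), the `GENERATORS` registry, and the rule of matching `ct2fast` in the model name are all assumptions made for illustration, and a Hub lookup is left out to keep it self-contained.

```python
import json
from pathlib import Path


class GeneratorBase:
    """Hypothetical common interface shared by all generators."""

    def __init__(self, model_id: str):
        self.model_id = model_id

    def generate(self, prompt: str) -> str:
        raise NotImplementedError


class TransformersGenerator(GeneratorBase):
    """Placeholder for a transformers-backed generator."""


class CT2FastGenerator(GeneratorBase):
    """Placeholder for a CTranslate2-backed generator."""


# Hypothetical registry: model type -> generator class.
GENERATORS = {"ct2fast": CT2FastGenerator}


def detect_model_type(model_id: str) -> str:
    """Guess the model type from a local path or a Hub-style model id."""
    # Assumption: converted checkpoints carry "ct2fast" in their name.
    if "ct2fast" in model_id.lower():
        return "ct2fast"
    # For a local directory, fall back to "model_type" in config.json.
    config_path = Path(model_id) / "config.json"
    if config_path.is_file():
        with config_path.open(encoding="utf-8") as f:
            return json.load(f).get("model_type", "transformers")
    return "transformers"


def get_generator(model_id: str) -> GeneratorBase:
    """Return an instance of the generator class matching the model type."""
    cls = GENERATORS.get(detect_model_type(model_id), TransformersGenerator)
    return cls(model_id)
```

With this layout the main function only calls `get_generator(model_id)`, and supporting a new backend means adding one entry to the registry.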