Casper
Fix #1200 and #1174 with Python 3.8.13. The stream no longer works with the standard code presented. We have to change the threading approach to run it successfully. Additionally,...
### Version of this library. Using v1.41.0 of unicorn websocket and 0.12.2 of unicorn-fy. ### Solution to Issue cannot be found in the documentation or other Issues and also occurs...
Hi @PanQiWei, I would like to request support for MPT models as they are SOTA with a commercial license. MPT models (Base, Story-Writer, Instruct, Chat): https://huggingface.co/mosaicml/mpt-7b I found an implementation...
Background: PyTorch's DataLoader hangs on several machines (locally, VM, Colab) because the `num_workers` argument is set too high. Generally, when using multiple processes, we want to scale with the number of...
Hi MosaicML. AutoGPTQ is a package trying to provide support for quantizing various LLMs. However, to do so, a few requirements are needed. Here are a few issues: - MPTForCausalLM...
MosaicML released its MPT 30B version today with 8k context and an Apache 2.0 license. ## Why you should support MPT 30B Let me present my argument for why MPT...
Hi @cg123, I am the author of [AutoAWQ](https://github.com/casper-hansen/AutoAWQ). After being in contact with TheBloke, it seems there are some issues with models from MergeKit. - Weights are not the same...
With AutoAWQ, we can fuse layers for a 2-3x speedup directly by passing a `quantization_config`. If this argument can be supported, it will be possible to evaluate quantized models at...
Dear vLLM maintainers @WoosukKwon and @zhuohan123 (@Yard1), DeepSpeed has released its serving framework which claims to be faster than vLLM. The main speedup comes from [Dynamic SplitFuse](https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-fastgen#b-dynamic-splitfuse-) which is a...
### Please check that this issue hasn't been reported before. - [X] I searched previous [Bug Reports](https://github.com/OpenAccess-AI-Collective/axolotl/labels/bug) and didn't find any similar reports. ### Expected Behavior That the model can start...