Results: 10 issues by Marco

I have modified inference/translate.py to support batch text translation: a list of text requests can now be sent and processed together for better efficiency.
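The diff itself isn't shown in this excerpt; as a rough sketch of what a batched entry point of this kind typically looks like (the function name and the model/tokenizer handles below are hypothetical, not the actual translate.py API):

```python
from typing import List

def translate_batch(texts: List[str], model, tokenizer, max_new_tokens: int = 256) -> List[str]:
    """Hypothetical batched wrapper: tokenize a list of requests together and decode in one pass."""
    # Pad to the longest request so the whole list runs as a single batch.
    inputs = tokenizer(texts, return_tensors="pt", padding=True).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens, then decode each translation separately.
    generated = outputs[:, inputs["input_ids"].shape[1]:]
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```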

CLA Signed

Traceback (most recent call last):
  File "/home/marco/Scrivania/TESI/serving/vllm/vllm/engine/async_llm_engine.py", line 28, in _raise_exception_on_finish
    task.result()
  File "/home/marco/Scrivania/TESI/serving/vllm/vllm/engine/async_llm_engine.py", line 359, in run_engine_loop
    has_requests_in_progress = await self.engine_step()
  File "/home/marco/Scrivania/TESI/serving/vllm/vllm/engine/async_llm_engine.py", line 338, in engine_step
    request_outputs =...

Hi! I'm currently implementing LoRA in order to be able to train this model on lower-end, consumer-grade GPUs. For the linear proj there are no issues and the...
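For context, LoRA on a linear projection normally keeps the base weight frozen and adds a low-rank trainable update; a minimal PyTorch sketch (not this repository's code) might look like:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA wrapper: y = Wx + (alpha/r) * B(Ax), with W frozen and A, B trainable."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the original projection
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # start as a no-op so training begins from the base model
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))
```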

I'm working on a forecasting problem where my client has data for around 5 years, but the sales events occur sporadically. These sales events typically last about a week out...

# What does this PR do?
- Added support for Mixture of Depths
- Added example usage for MoD
- Modified the README file
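For readers unfamiliar with Mixture of Depths: the idea is that each layer routes only a top-k subset of tokens through the full block while the remaining tokens skip it. A schematic sketch of that routing, assuming a generic transformer block and not taken from this PR, is:

```python
import torch
import torch.nn as nn

class MoDLayer(nn.Module):
    """Schematic Mixture-of-Depths wrapper: only the top-k scoring tokens pass through `block`."""
    def __init__(self, block: nn.Module, hidden_size: int, capacity: float = 0.5):
        super().__init__()
        self.block = block              # e.g. a transformer block
        self.router = nn.Linear(hidden_size, 1)
        self.capacity = capacity        # fraction of tokens processed per layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_size)
        scores = self.router(x).squeeze(-1)                     # (batch, seq_len)
        k = max(1, int(x.size(1) * self.capacity))
        topk = scores.topk(k, dim=1).indices                    # tokens routed through the block
        idx = topk.unsqueeze(-1).expand(-1, -1, x.size(-1))
        selected = torch.gather(x, 1, idx)
        processed = self.block(selected)
        # Scatter the processed tokens back; unselected tokens pass through unchanged.
        out = x.clone()
        out.scatter_(1, idx, processed)
        return out
```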

# What does this PR do?
Everything has been tested with tiny models; it should work out of the box even with bigger models. I've modified the freeze module cycle...
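The excerpt cuts off here, but a freeze-module loop of the kind mentioned usually just iterates over named parameters and disables gradients for everything outside a trainable allow-list; roughly (the keyword names below are illustrative):

```python
import torch.nn as nn

def freeze_except(model: nn.Module, trainable_keywords=("lora", "adapter")) -> None:
    """Illustrative freeze loop: only parameters whose name matches a keyword stay trainable."""
    for name, param in model.named_parameters():
        param.requires_grad = any(key in name for key in trainable_keywords)
```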

pending

# What does this PR do?
Adds support for Mixture of Depths. At the moment, if Qwen is loaded an error is thrown; when the Qwen issue is resolved I'll remove it.

Your installation process is broken. Even gradio 3.9 will not work; please update requirements.txt.

We are encountering a particular problem: as described [here](https://github.com/astramind-ai/BitMat/issues/7), our BMM kernels run fine unless the matmul is between specifically shaped values. --- ```python def pack_ternary(x, n_element_in_one_int=4): """ Pack...
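The `pack_ternary` excerpt above is truncated; purely as an illustration of what packing 4 ternary values into one integer usually means (2 bits per value, with {-1, 0, 1} offset to {0, 1, 2}), and not necessarily BitMat's exact scheme:

```python
import torch

def pack_ternary_sketch(x: torch.Tensor, n_element_in_one_int: int = 4) -> torch.Tensor:
    """Illustrative packing: store 4 ternary values {-1, 0, 1} per uint8, 2 bits each."""
    assert x.shape[-1] % n_element_in_one_int == 0
    shifted = (x + 1).to(torch.uint8)                               # map {-1, 0, 1} -> {0, 1, 2}
    shifted = shifted.reshape(*x.shape[:-1], -1, n_element_in_one_int)
    packed = torch.zeros(shifted.shape[:-1], dtype=torch.uint8, device=x.device)
    for i in range(n_element_in_one_int):
        packed |= shifted[..., i] << (2 * i)                        # 2 bits per element
    return packed
```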

Hi, thanks for your amazing work! I would love to know if you have any update on the sparse training script, or maybe if you are planning to quantize Vicuna...