Alexey Mametyev issues

Results: 10 issues of Alexey Mametyev

Provide logits or logprobs in the API

Feature request: How can I get logits (probabilities of each next token) during generation, just like I can in the OpenAI API (logprobs)? This feature will be helpful...

feature request

api
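
For context on the logits/logprobs request above, here is a minimal sketch of how raw next-token scores (logits) relate to log-probabilities through a log-softmax. It is not taken from any API mentioned in the issue: the `log_softmax` helper and the three example logit values are hypothetical, chosen only to illustrate the relationship.

```python
import math


def log_softmax(logits):
    """Convert raw next-token scores (logits) into log-probabilities."""
    m = max(logits)  # subtract the max before exponentiating, for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


# Hypothetical logits for a three-token vocabulary (illustrative values only).
logits = [2.0, 1.0, 0.1]
logprobs = log_softmax(logits)

# Exponentiating the log-probabilities recovers a distribution that sums to 1.
probs = [math.exp(lp) for lp in logprobs]
```

The token with the largest logit also has the largest log-probability, and every log-probability is at most zero.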

Problem with installation

I've tried to install unsloth on my server using pip, but pip can't find the required version: ``` %pip install "unsloth[cu121_ampere_torch211] @ git+https://github.com/unslothai/unsloth.git" Defaulting to user installation because normal site-packages is...

fixed - pending confirmation

Turn off gravity for rigid body

I want to turn off gravity for one scene, but the methods from the pymunk docs do not work. First, I tried to change the scene's space gravity parameter in the construct method: ```...

[FEATURE REQUEST] Add Support for Qwen1.5-MoE Architecture in DeepSpeed-MII

# Qwen1.5-MoE Support With the increasing attention on mixture-of-experts (MoE) models, especially following the advancements heralded by Mixtral, I propose considering the integration of the Qwen1.5-MoE architecture, particularly its A2.7B...

Can this be used for Jambo inference

Can I use this solution for inference of https://huggingface.co/ai21labs/Jamba-v0.1/discussions with offloading of the Mamba MoE layers? Jamba is a SOTA open-source long-context model, and support for it would be very useful for this...

Run without quantization

QuantConfig is a mandatory argument of the model-building function:

```python
model = build_model(
    device=device,
    quant_config=quant_config,
    offload_config=offload_config,
    state_path=state_path,
)
```

Can I run Mixtral with layer offloading, but WITHOUT quantization, using this library?

problem with one letter world

Problem with deepspeed finetuning

Problem with LLama training with LoRA

No answer