MoE-Infinity
PyTorch library for cost-effective, fast, and easy serving of MoE models.
### Prerequisites - [x] I have read the [MoE-Infinity documentation](). - [x] I have searched the [Issue Tracker](https://github.com/EfficientMoE/MoE-Infinity/issues) to ensure this hasn't been reported before. ### System Information Running on...
### Prerequisites - [x] I have searched existing issues and reviewed documentation. ### Problem Description May I ask what parallel techniques you have implemented? When setting CUDA_VISIBLE_DEVICES=0,1,2,3, all four cards...
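For context on the multi-GPU question above: `CUDA_VISIBLE_DEVICES` must be set before the CUDA runtime is initialized, and it remaps the listed physical GPUs to logical indices `0..N-1` as seen by frameworks such as PyTorch. A minimal sketch of that remapping (illustrative only; it does not touch MoE-Infinity's own parallelism):

```python
import os

# Restrict the process to four physical GPUs; CUDA-aware frameworks
# then see them as logical devices cuda:0 .. cuda:3.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"

# Parse the variable the way the CUDA runtime does: a comma-separated
# list of physical device indices, in the order they become visible.
visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
logical_to_physical = {logical: int(physical)
                       for logical, physical in enumerate(visible)}
print(logical_to_physical)  # {0: 0, 1: 1, 2: 2, 3: 3}
```

Note that the variable must be set before the first CUDA call in the process (or on the shell command line); changing it afterwards has no effect.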
### Prerequisites - [x] I have searched existing issues and reviewed documentation. ### Problem Description I want to measure the DeepSeek-v2-Lite-Chat throughput of MoE-Infinity using an RTX 4080 Super (16GB). The code I...
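For throughput questions like the one above, the usual metric is generated tokens per second of wall-clock time. A minimal measurement sketch, with a stubbed `generate` standing in for the actual model call (the real call would go through MoE-Infinity's generation API, which is not reproduced here):

```python
import time

def generate(prompt: str, max_new_tokens: int) -> list[int]:
    """Stub standing in for the real model.generate call; returns one
    dummy token id per requested new token."""
    return list(range(max_new_tokens))

def measure_throughput(prompts: list[str], max_new_tokens: int = 128) -> float:
    """Return decode throughput in generated tokens per second."""
    start = time.perf_counter()
    total_tokens = 0
    for p in prompts:
        out = generate(p, max_new_tokens)
        total_tokens += len(out)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed

tps = measure_throughput(["hello"] * 4, max_new_tokens=8)
```

In a real benchmark you would also discard a warm-up batch (the first requests pay one-time weight-loading cost, which matters especially for an offloading system) and report prefill and decode throughput separately.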
I want to run inference with other DeepSeek models on a V100 GPU. Are they supported? For example, deepseek-ai's DeepSeek-R1-Distill-Llama-70B or DeepSeek-R1-Distill-Qwen-32B?
### Prerequisites - [x] I have searched existing issues and reviewed documentation. ### Problem Description Does the current code framework support DeepSeek V3? I found DeepSeek V3 model files in...
Hi! I'm currently running MoE-Infinity with Mixtral-8×7B-Instruct-v0.1-offloading-demo (the quantized version) on MMLU. I encountered a failure when loading the model weights, and I'd like to know whether the MoE-Infinity algorithm is compatible with...
Bumps [pyarrow](https://github.com/apache/arrow) from 12.0.0 to 14.0.1. Commits ba53748 MINOR: [Release] Update versions for 14.0.1 529f376 MINOR: [Release] Update .deb/.rpm changelogs for 14.0.1 b84bbca MINOR: [Release] Update CHANGELOG.md for 14.0.1 f141709...
Thank you for your work. May I ask what the differences are between the open-source code on GitHub and the version described in the paper? I tested **deepseek-chat-lite** on bigbench,...
### Prerequisites - [x] I have read the [MoE-Infinity documentation](). - [x] I have searched the [Issue Tracker](https://github.com/EfficientMoE/MoE-Infinity/issues) to ensure this hasn't been reported before. ### System Information GPU: NVIDIA...
Thanks for your work. I have read through the code but cannot find where `prefetch_experts` is called. This function appears only in *comments* under the `model` directory, where the logic now is...
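To illustrate the concept behind the question above: expert prefetching means a background worker copies the weights of experts the router is predicted to need from host storage into a device-side cache, so the forward pass does not stall on a load. The sketch below is hypothetical and is not MoE-Infinity's actual implementation; the class, its `prefetch_experts` method, and the dict-based "device cache" are all stand-ins for illustration:

```python
import threading
import queue

class ExpertPrefetcher:
    """Illustrative prefetcher: a background thread copies expert
    weights from 'host' storage into a 'device' cache ahead of use."""

    def __init__(self, host_store: dict):
        self.host = host_store      # expert_id -> weights (e.g. on CPU/SSD)
        self.device_cache = {}      # expert_id -> weights ("on GPU")
        self.requests = queue.Queue()
        self.lock = threading.Lock()
        threading.Thread(target=self._worker, daemon=True).start()

    def prefetch_experts(self, expert_ids):
        """Asynchronously request experts predicted to be needed soon."""
        for eid in expert_ids:
            self.requests.put(eid)

    def _worker(self):
        while True:
            eid = self.requests.get()
            with self.lock:
                if eid not in self.device_cache:
                    self.device_cache[eid] = self.host[eid]  # the "copy"
            self.requests.task_done()

    def get(self, eid):
        """Fetch an expert for the forward pass; a miss pays full latency."""
        with self.lock:
            if eid not in self.device_cache:
                self.device_cache[eid] = self.host[eid]
            return self.device_cache[eid]

pf = ExpertPrefetcher({0: "expert0-weights", 1: "expert1-weights"})
pf.prefetch_experts([0, 1])  # router predicted experts 0 and 1 are needed next
pf.requests.join()           # wait until the background copies finish
w = pf.get(0)                # cache hit: no load stall
```

In a real system the "copy" is a host-to-device transfer overlapped with compute on a separate CUDA stream, and the cache has a capacity limit with an eviction policy.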