Leyang Xue
Feel free to replace the example model with "mistralai/Mixtral-8x7B-Instruct-v0.1". If there are further errors when running the multi-GPU version, please feel free to post them here.
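For reference, a minimal sketch of what that swap looks like. The `MoE` entry point and the config keys (`offload_path`, `device_memory_ratio`) are assumed to match the repository's examples; adjust the paths and values to your setup:

```python
import os
from transformers import AutoTokenizer
from moe_infinity import MoE  # assumed entry point, as used in the examples

checkpoint = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # replace the example model here
config = {
    # where offloaded expert weights are stored (assumed key)
    "offload_path": os.path.join(os.path.expanduser("~"), "moe-infinity"),
    # assumed knob: fraction of GPU memory reserved for the expert cache
    "device_memory_ratio": 0.75,
}

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MoE(checkpoint, config)

inputs = tokenizer("Translate to French: Hello, world!", return_tensors="pt")
outputs = model.generate(inputs.input_ids.to("cuda"), max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```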
> when I run interface_example.py, I meet the errors below:
> ```
> Traceback (most recent call last):
>   File "/home/admin/Documents/MoE-Infinity-main/examples/interface_example.py", line 32, in
>     names = datasets.get_dataset_config_names(dataset_name)
>   File "/home/admin/anaconda3/envs/moe-infinity/lib/python3.9/site-packages/datasets/inspect.py", line 347, in get_dataset_config_names
>     dataset_module...
> ```
Still a work in progress; it is currently not supported.
The batching engine is not provided yet. Auto-batching, which specifies a max batch size and a max delay, is the simplest way to implement it (see the sketch below); continuous batching is WIP.
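A generic illustration of that auto-batching policy (not part of MoE-Infinity; the function name and defaults are assumptions):

```python
import queue
import time

def auto_batch(request_queue: "queue.Queue", max_batch_size: int = 8, max_delay_s: float = 0.05):
    """Collect requests until either max_batch_size is reached or max_delay_s has
    elapsed since the first request arrived, then yield the batch."""
    while True:
        batch = [request_queue.get()]  # block until at least one request arrives
        deadline = time.monotonic() + max_delay_s
        while len(batch) < max_batch_size:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                batch.append(request_queue.get(timeout=remaining))
            except queue.Empty:
                break
        yield batch
```

Each yielded batch can then be tokenized together and passed to a single `generate` call.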
Is it possible to provide a script to reproduce this? If it is one of the examples, please specify which one you ran. Providing your hardware settings might also be helpful.
We need to go with a nightly release at the current stage.
Cross-node expert parallelism is implemented but not tested; the current behavior is `expert_gpu_id = expert_id % num_gpu` as the default option, similar to DeepSpeed.
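For concreteness, the round-robin placement that rule produces (illustration only, not the library's code):

```python
num_gpu = 4
num_experts = 8
placement = {expert_id: expert_id % num_gpu for expert_id in range(num_experts)}
print(placement)  # {0: 0, 1: 1, 2: 2, 3: 3, 4: 0, 5: 1, 6: 2, 7: 3}
```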
DeepSeek V3 support is currently under way; due to the lack of FP8 operation support in PyTorch, it may take some time.
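To illustrate the blocker: recent PyTorch versions expose FP8 dtypes, but most operators are still not implemented for them. A rough check (exact behavior depends on your torch version and hardware):

```python
import torch

# FP8 dtypes exist in recent PyTorch (>= 2.1), but general ops on them are mostly unimplemented.
a = torch.randn(4, 4).to(torch.float8_e4m3fn)
b = torch.randn(4, 4).to(torch.float8_e4m3fn)
try:
    _ = a @ b  # typically raises: "... not implemented for 'Float8_e4m3fn'"
except RuntimeError as e:
    print("FP8 matmul unsupported:", e)
```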
Do you have a detailed log or console output?
These two models are Llama and Qwen themselves, just distilled using R1; we plan to support Qwen later.