fade_away
OP, how did this turn out in the end? I ran into the same problem; the error also comes from math_functions.cu.
> We do not currently support pipeline parallelism with MII.

Thank you. I see this tutorial (https://www.deepspeed.ai/tutorials/pipeline/) for DeepSpeed. What is needed to manually implement pipeline parallelism in DeepSpeed?
> The tutorial you linked provides an example of pipeline parallelism. However, the pipeline parallelism implemented in DeepSpeed is intended for training rather than inference. For inference we do model...
> How large are the models you want to run? An alternative approach which you can try right now with MII is to have multiple model replicas with tensor parallelism....
> If you do `replica_num=4, tensor_parallel=2` on a 4 node setup (each with 2 GPU) there will still be some communication between nodes. The load balancer does a simple round-robin...
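To make the round-robin behaviour mentioned in the quote concrete, here is a minimal sketch of how requests would rotate across replicas. This is not MII's load balancer code: `make_round_robin` and the replica names are hypothetical, and only the 4-replica layout is taken from the quoted settings.

```python
from itertools import cycle


def make_round_robin(replicas):
    """Return a function that hands out replicas in a fixed rotation."""
    rotation = cycle(replicas)

    def next_replica():
        # Each call returns the next replica, wrapping around at the end.
        return next(rotation)

    return next_replica


# Hypothetical layout: 4 replicas, each one spanning 2 GPUs (tensor_parallel=2).
replicas = [f"replica-{i}" for i in range(4)]
pick = make_round_robin(replicas)

# Six incoming requests are assigned in order, ignoring current load.
assignments = [pick() for _ in range(6)]
print(assignments)
```

The selection ignores how busy each replica is, so a slow request on one replica does not change where the next request goes.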
> * What command did you use to install from source?
> * What version of cuda do you have installed?

`pip install -e .` | NVIDIA-SMI 535.104.12 Driver Version:...
> happened to me as well on a 2L4 GCP instance, > > first ran the docker image for proper build there, > > ```shell > docker run --gpus all...
OMG, it takes a lot of work to get the project to build. So why is there no README?
> It seems I got it running OK by modifying some Python code. Here is what I did:
> - `python build.py --hf-path=databricks/dolly-v2-3b`
> - Add a line to tests/chat.py: `args.add_argument("--model", type=str, default="auto")`...
> Hi @sleepwalker2017, thanks for trying out the project. Are you trying to just run the chat bot, or build from source? If you are just trying to run it,...