Mayank Mishra

Results 187 comments of Mayank Mishra

Hmm, @thomasw21, so the PR I referred to above uses both the HF accelerate and DS-inference libraries, depending on which backend we want to run inference with. But it does require a transformers version...

@KMFODA currently, I am planning to create a standalone library. For now, I am adding to this repo itself.

@thomasw21, I am not sure how this differs from the PR I pointed to above ^^. Can you explain?

Oh, I think I understand the issue now. Maybe something like loading from the universal checkpoints and running inference, etc.?

@pohunghuang-nctu can you confirm your CUDA version? I was using 11.6 and getting the same issue; switching to 11.3 resolved it for me. Please give it a try. Thanks
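To make the version check above concrete, here is a small sketch of how one might classify the CUDA toolkit version PyTorch was built against. The helper and the "known good/bad" sets are hypothetical and reflect only the versions mentioned in this thread (11.3 worked, 11.6 did not), not any official compatibility matrix:

```python
# Versions observed in this discussion only -- not an official support list.
KNOWN_GOOD = {"11.3"}
KNOWN_BAD = {"11.6"}

def cuda_status(version: str) -> str:
    """Classify a CUDA version string like '11.3' or '11.3.109'."""
    major_minor = ".".join(version.split(".")[:2])
    if major_minor in KNOWN_GOOD:
        return "known good"
    if major_minor in KNOWN_BAD:
        return "known bad"
    return "untested"

# In practice you would feed in the toolkit version PyTorch reports, e.g.:
#   import torch
#   print(cuda_status(torch.version.cuda))
print(cuda_status("11.3.109"))  # known good
print(cuda_status("11.6"))      # known bad
```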

@pohunghuang-nctu I have PyTorch installed via conda (with CUDA 11.3), and DeepSpeed and apex have been built from their master branches against CUDA 11.3.

I haven't played around with it that much, but batch size > 1 is working for me.

I only have a single node with 8 GPUs (80GB each). Are you using pipeline parallelism across nodes? Does DS-inference support that?

@pohunghuang-nctu @pai4451 thanks for letting me know about the multi-node deployment. I am guessing this would be using pipeline parallelism? However, what are the advantages of using multi-node during inference?...
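A rough back-of-envelope for the question above, assuming a 176B-parameter model (e.g. BLOOM, which this discussion appears to be about) in fp16 on the 8x80GB node mentioned earlier; the numbers are illustrative estimates, not measurements:

```python
# Back-of-envelope: does a 176B fp16 model fit on a single 8x80GB node?
PARAMS = 176e9          # assumed model size (e.g. BLOOM-176B)
BYTES_PER_PARAM = 2     # fp16
GPU_MEM_GB = 80
GPUS_PER_NODE = 8

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9   # total weight memory, ~352 GB
per_gpu_gb = weights_gb / GPUS_PER_NODE       # with tensor parallel = 8, ~44 GB/GPU

print(f"total weights: {weights_gb:.0f} GB")
print(f"per GPU (TP=8): {per_gpu_gb:.0f} GB of {GPU_MEM_GB} GB")
# The weights alone fit on one node; the remaining headroom per GPU goes to
# KV cache and activations. So multi-node (e.g. pipeline parallelism) would
# mainly buy larger batches / longer sequences, not the ability to fit weights.
```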

I built DeepSpeed from source (master branch). Also, transformers is 4.21.1, installed using pip.