Mayank Mishra
Hmm, @thomasw21 so the PR I referred to above uses both the HF accelerate and DS-inference libraries, depending on which backend we want to run inference with. But it does require a transformers version...
@KMFODA currently, I am planning to create a standalone library. For now, I am adding to this repo itself.
@thomasw21, I am not sure how this differs from the PR I pointed to above ^^. Can you explain?
Oh, I think I understand the issue now. Maybe something like loading from the universal checkpoints and running inference, etc.?
@pohunghuang-nctu can you confirm your CUDA version? I was on 11.6 and hit the same issue; switching to 11.3 resolved it for me. Please give it a try. Thanks
@pohunghuang-nctu I have PyTorch installed via conda (with CUDA 11.3), and DeepSpeed and apex were built from their master branches against CUDA 11.3.
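For anyone else debugging the same mismatch: a quick sanity check is to compare the system CUDA toolkit version against the CUDA version PyTorch was built with. A minimal sketch (both checks are guarded, since `nvcc` or `torch` may not be present on a given box):

```shell
# System CUDA toolkit version, if nvcc is on PATH
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version | grep release
fi

# CUDA version PyTorch was built against, if torch is installed
if command -v python3 >/dev/null 2>&1; then
  python3 - <<'EOF'
try:
    import torch
    print("torch built with CUDA:", torch.version.cuda)
except ImportError:
    print("torch not installed")
EOF
fi
```

If the two versions disagree (e.g. toolkit 11.6 vs. torch built for 11.3), extensions compiled from source (DeepSpeed, apex) can fail in exactly this way.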
I haven't played around with it that much, but batch size > 1 is working for me.
I only have a single node with 8 GPUs (80GB each). Are you using pipeline parallelism across nodes? Does DS-inference support that?
@pohunghuang-nctu @pai4451 thanks for letting me know about the multi-node deployment. I am guessing this would be using pipeline parallelism? However, what are the advantages of using multi-node during inference?...
I built DeepSpeed from source (master branch). Also, transformers (4.21.1) is installed using pip.