Reza Yazdani
I am glad you could run it with a large batch now! :) I think this might be related to some cache-allocation issues. We are working on optimizing that part...
@pai4451 Currently, I limit the token length for each query to 128. I am going to increase this soon. But can you try with a smaller length and see if the issue is...
Regarding the batch size, I have tried with a batch size of up to 128 and it worked fine on my side.
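As a minimal sketch of what the per-query token limit and batched inference might look like (hedged: the model name, prompts, and generation settings below are placeholders, not the exact setup from this thread):

```
# Tokenize a batch of queries, truncating each to the 128-token limit
# mentioned above, then run batched generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder model name
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
tokenizer.padding_side = "left"            # left-pad for decoder-only generation

queries = ["Hello, my name is", "DeepSpeed inference can"] * 64  # batch of 128
inputs = tokenizer(
    queries,
    truncation=True,
    max_length=128,   # per-query token-length limit
    padding=True,
    return_tensors="pt",
)
with torch.no_grad():
    outputs = model.generate(
        **inputs, max_new_tokens=20, pad_token_id=tokenizer.eos_token_id
    )
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```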
Hi @mayank31398, I am still working on this. Can I ask what the average maximum number of tokens per input would be? Potentially, this can go to as many...
Hi @mayank31398, Looking into it right now; let me first merge this into another PR. I will let you know. Thanks, Reza
Hi @xk503775229, Thanks for your interest in trying Int8 for other models. In general, you should be able to do so; however, one issue here is that you want to...
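A minimal sketch of what trying int8 with DeepSpeed inference could look like (hedged: the model name is a placeholder, int8 kernel support depends on the model, and the argument names follow the DeepSpeed inference API of this era):

```
# Load a Hugging Face model and request DeepSpeed's int8 inference kernels.
import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model
model = deepspeed.init_inference(
    model,
    mp_size=1,                       # model-parallel degree
    dtype=torch.int8,                # request int8 inference kernels
    replace_with_kernel_inject=True, # inject DeepSpeed's fused kernels
)
```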
Hi @rahul003, I am able to repro this on my side using your script. However, when using mine, which is as follows, there is no issue with it: ``` import...
Hi @rahul003, I checked again with your script, and it seems the issue is related to setting the `low_cpu_mem_usage` flag to true when creating the model. Can you please...
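For reference, `low_cpu_mem_usage` is a real `from_pretrained` argument; a minimal sketch of the suggested change (the model name is a placeholder, not the script from this thread):

```
from transformers import AutoModelForCausalLM

# low_cpu_mem_usage=True reduces peak host memory while loading weights; per
# the comment above, try low_cpu_mem_usage=False to see if the issue goes away.
model = AutoModelForCausalLM.from_pretrained("gpt2", low_cpu_mem_usage=False)
```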
Can you try the script I pasted in my earlier comment?
Also, can you please show the outputs? Thanks, Reza