Connor Holmes comments

Results: 17 comments of Connor Holmes

[BUG] DeepSpeed Inference with GPT-J using batches with padding gives wrong outputs

Thanks @tomerip for looking back into this. I think this does appear to be the same underlying issue as https://github.com/microsoft/DeepSpeed/issues/2357. A fix for this will likely come from https://github.com/microsoft/DeepSpeed/pull/2433, but...

Refine quantizer for supporting larger hidden-dim and group size

Changes fixed under later memory refactor.

Quantization Support for Fastgen?

Adding quantization support is a high-priority item on our roadmap! We are working to add support for this soon and, as the timeline becomes more concrete, will share more...

[BUG] DeepSpeed Inference reports Signal code: Integer divide-by-zero when Seq length is 4096 for GPT2

Thank you both for looking into this. I've made a PR (https://github.com/microsoft/DeepSpeed/pull/3046) to clean up this scheduling code such that it should work for our full range of supported sequence...

[BUG] unable to run inference with gpt-j-6b

Hi @abacaj, I have created a PR (https://github.com/microsoft/DeepSpeed/pull/3256) where I am now seeing results align between DeepSpeed and the HuggingFace baseline. If you could validate in your environment as well...

self.qkv_gemm_func returns ValueError: The deleter and context arguments are mutually exclusive.

Hi @publicstaticvo, thank you for reporting this issue. Currently, the Hybrid Engine is only supported for the OPT family of models, but additional model support (including GPT-J) is on our...

[BUG] DeepSpeed tries to allocate memory from GPU 0 even though include was set to go with localhost:3,5

Hi @tmatup, within DeepSpeed, we control which devices are visible by setting the `CUDA_VISIBLE_DEVICES` environment variable, as you can see in the final line in your log. The practical impact...
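As a rough illustration of the device-visibility point above (a sketch of general CUDA behavior, not DeepSpeed's launcher code): when `CUDA_VISIBLE_DEVICES` lists a subset of GPUs, the process renumbers them from zero, so "GPU 0" inside the process is the first entry in the list. The helper name `logical_to_physical` and the value `"3,5"` (taken from the `localhost:3,5` include in the issue title) are assumptions for illustration, and the sketch assumes integer device ids rather than GPU UUIDs.

```python
def logical_to_physical(visible_devices: str) -> dict[int, int]:
    """Map in-process (logical) CUDA indices to physical GPU ids.

    Hypothetical helper: it only parses a comma-separated list of
    integer ids, in the order they appear in CUDA_VISIBLE_DEVICES.
    """
    physical_ids = [int(tok) for tok in visible_devices.split(",") if tok.strip()]
    # The first listed GPU becomes logical device 0, the second device 1, etc.
    return dict(enumerate(physical_ids))


# Assumed value corresponding to an include of localhost:3,5
mapping = logical_to_physical("3,5")
print(mapping)  # {0: 3, 1: 5}
```

Under this reading, a log line mentioning "GPU 0" would refer to logical device 0, which here is physical GPU 3.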

[BUG] AttributeError: 'Parameter' object has no attribute 'scale'

Hi all, sorry for the slow response time on this! I have created a PR (https://github.com/microsoft/DeepSpeed/pull/3256) where I am now seeing model outputs match the HuggingFace baseline. If anyone has...

add bf16 cuda kernel support

> @cmikeh2 the nv-mii test and amd test failed again. Do you think it's related to my modification? Or just need to retry?

I think it’s likely unrelated. We sometimes...

[BUG] Incorrect Model Outputs When Using Beam Search

Hi @zelcookie, thanks for reporting this. I am able to reproduce with your scripts and will work on determining a root cause of this.