Michael Wyatt comments

Results: 271 comments by Michael Wyatt

[BUG] Illegal memory access CUDA error when using long sequences

@tomeras91 I can confirm that I'm able to reproduce this error. I don't think it has anything to do with `MAX_OUT_TOKES`. @RezaYazdaniAminabadi could you take a look at this?

How to set batch size for using deepspeed inference API?

@gtvforever you can pass multiple items to the pipeline like so: `outputs = generator(["DeepSpeed is the", "It's a DeepSpeed kind of summer"], do_sample=True, min_length=50)`. If you are looking to the...

DeepSpeed support with CUDA 11.2?

Could you share the output of `nvcc --version`? Also, does the install work with just `pip install deepspeed` or `pip install git+https://github.com/microsoft/DeepSpeed.git`?

[BUG] import deepspeed error when building from source

@kisseternity Are you able to run if you install from pypi (`pip install deepspeed==0.6.5`)? I've seen a similar issue in the past and it was likely due to the symlinks...

[BUG] import deepspeed error when building from source

@kisseternity thanks for sharing! I think the issue is stemming from symbolic links we create in `setup.py` - but I'm unable to reproduce this behavior in my own environment. Could...

[BUG] DeepSpeed Comm. Backend not compatible with outside torch.distributed module

@kisseternity could you please share what issue you are having when building from source?

New microsoft/bloom-deepspeed-inference-fp16 weights not working with DeepSpeed MII

The fp16 Bloom weights are now supported. Int8 models are also supported, but currently the DeepSpeed sharded int8 weights for the Bloom model will throw an error. I'm working on...

New microsoft/bloom-deepspeed-inference-fp16 weights not working with DeepSpeed MII

@mayank31398 @TahaBinhuraib I finally found the time to fix #69 so that it works with int8. You no longer need to download the sharded checkpoint files separately and MII will...

Support multiple nodes deployment?

Hi @pohunghuang-nctu, we currently do not support multi-node deployments, but this feature is on our radar as we continue development of MII.

[BUG] Different outputs by original model and inference engine

@reymondzzzz I am able to reproduce these results. I'm not exactly sure why the outputs don't match. If I use the HuggingFace pipeline API, the problem goes away. Can you...