Mayank Mishra
### System Info

```Shell
- `Accelerate` version: 0.11.0
- Platform: Linux-4.18.0-305.25.1.el8_4.x86_64-x86_64-with-glibc2.17
- Python version: 3.8.13
- Numpy version: 1.22.3
- PyTorch version (GPU?): 1.11.0a0+gitbc2c6ed (True)
- `Accelerate` default config: Not...
```
Currently, the scripts are working correctly; however, there is a memory leak. In https://github.com/huggingface/accelerate/issues/614, @sgugger says that it's not in accelerate, which is probably true. My hunch is a related...
New `microsoft/bloom-deepspeed-inference-fp16` and `microsoft/bloom-deepspeed-inference-int8` weights not working with DeepSpeed MII @jeffra @RezaYazdaniAminabadi

```
Traceback (most recent call last):
  File "scripts/bloom-inference-server/server.py", line 83, in
    model = DSInferenceGRPCServer(args)
  File "/net/llm-shared-nfs/nfs/mayank/BigScience-Megatron-DeepSpeed/scripts/bloom-inference-server/ds_inference/grpc_server.py", line 36,...
```
I think that right now, the dtype of the prompt embeddings and the dtype of the model are tied together, since the weights are copied. It would be nice to have a different dtype for...
closes: https://github.com/huggingface/peft/issues/62
DS-inference runs out of memory more quickly for GPT2 than for BLOOM, even though they have a similar number of parameters. Tables for both models are in the [README](https://github.com/bigcode-project/bigcode-inference-benchmark) @jeffra
This PR is for using the new communication functions:
- `all_gather_into_tensor` instead of `_all_gather_base`
- `reduce_scatter_tensor` instead of `_reduce_scatter_base`

If these functions are not found (for older torch versions), we default to...
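The fallback described above is a common version-compatibility pattern: look up the new public name on the module and fall back to the underscore-prefixed private one if it is missing. A minimal sketch of that pattern, using a stand-in namespace instead of `torch.distributed` so it runs without torch installed (the stand-in functions and their return strings are purely illustrative):

```python
from types import SimpleNamespace

def _all_gather_base(output, input):
    """Stand-in for the legacy, underscore-prefixed collective."""
    return "legacy all_gather"

# Simulate an older torch: only the private function exists.
old_dist = SimpleNamespace(_all_gather_base=_all_gather_base)

# Simulate a newer torch: the public name is present as well.
new_dist = SimpleNamespace(
    _all_gather_base=_all_gather_base,
    all_gather_into_tensor=lambda output, input: "new all_gather",
)

# Prefer the new public name; default to the legacy one if absent.
fn_old = getattr(old_dist, "all_gather_into_tensor", old_dist._all_gather_base)
fn_new = getattr(new_dist, "all_gather_into_tensor", new_dist._all_gather_base)

print(fn_old(None, None))  # legacy all_gather
print(fn_new(None, None))  # new all_gather
```

In real code the `getattr` lookup would be done once at import time on `torch.distributed`, so call sites are unaware of which torch version is running.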
This is observed during prompt tuning. However, I am not sure if the solution specified here is the best solution.
Hi, I need a pointer to MCR-DL. Is it open source? Where can I start looking into its codebase? @jeffra @yuxionghe