Mayank Mishra
### System Info

```Shell
- `Accelerate` version: 0.11.0
- Platform: Linux-4.18.0-305.25.1.el8_4.x86_64-x86_64-with-glibc2.17
- Python version: 3.8.13
- Numpy version: 1.22.3
- PyTorch version (GPU?): 1.11.0a0+gitbc2c6ed (True)
- `Accelerate` default config: Not...
```
Currently, the scripts are working correctly; however, there is a memory leak. In https://github.com/huggingface/accelerate/issues/614, @sgugger says that it's not in accelerate, which is probably true. My hunch is a related...
New `microsoft/bloom-deepspeed-inference-fp16` and `microsoft/bloom-deepspeed-inference-int8` weights not working with DeepSpeed MII @jeffra @RezaYazdaniAminabadi

```
Traceback (most recent call last):
  File "scripts/bloom-inference-server/server.py", line 83, in
    model = DSInferenceGRPCServer(args)
  File "/net/llm-shared-nfs/nfs/mayank/BigScience-Megatron-DeepSpeed/scripts/bloom-inference-server/ds_inference/grpc_server.py", line 36,...
```
I think that right now, the dtype of the prompt embeddings and the dtype of the model are tied together, since the weights are copied. It would be nice to have a different dtype for...
closes: https://github.com/huggingface/peft/issues/62
DS-inference runs out of memory more quickly for GPT2 than for BLOOM, even though they have a similar number of parameters. Tables for both models are in the [README](https://github.com/bigcode-project/bigcode-inference-benchmark) @jeffra
This PR is for using the new communication functions:
- `all_gather_into_tensor` instead of `_all_gather_base`
- `reduce_scatter_tensor` instead of `_reduce_scatter_base`

If these functions are not found (for older torch versions), we default to...
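The fallback described above is a common version-compatibility pattern: look up the new public name on the module and fall back to the underscore-prefixed private one if it is missing. A minimal sketch of that pattern, using a stand-in namespace instead of `torch.distributed` so it runs without torch installed (the stand-in functions and their return strings are purely illustrative):

```python
from types import SimpleNamespace

def _all_gather_base(output, input):
    """Stand-in for the legacy, underscore-prefixed collective."""
    return "legacy all_gather"

# Simulate an older torch: only the private function exists.
old_dist = SimpleNamespace(_all_gather_base=_all_gather_base)

# Simulate a newer torch: the public name is present as well.
new_dist = SimpleNamespace(
    _all_gather_base=_all_gather_base,
    all_gather_into_tensor=lambda output, input: "new all_gather",
)

# Prefer the new public name; default to the legacy one if absent.
fn_old = getattr(old_dist, "all_gather_into_tensor", old_dist._all_gather_base)
fn_new = getattr(new_dist, "all_gather_into_tensor", new_dist._all_gather_base)

print(fn_old(None, None))  # legacy all_gather
print(fn_new(None, None))  # new all_gather
```

In real code the `getattr` lookup would be done once at import time on `torch.distributed`, so call sites are unaware of which torch version is running.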
This is observed during prompt tuning. However, I am not sure if the solution specified here is the best solution.
Hi, I need a pointer to MCR-DL. Is it open source? Where can I start looking into its codebase? @jeffra @yuxionghe