DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Results: 126 DeepSpeed-MII issues

New `microsoft/bloom-deepspeed-inference-fp16` and `microsoft/bloom-deepspeed-inference-int8` weights not working with DeepSpeed MII @jeffra @RezaYazdaniAminabadi

```
Traceback (most recent call last):
  File "scripts/bloom-inference-server/server.py", line 83, in
    model = DSInferenceGRPCServer(args)
  File "/net/llm-shared-nfs/nfs/mayank/BigScience-Megatron-DeepSpeed/scripts/bloom-inference-server/ds_inference/grpc_server.py", line 36,...
```

Hello, when running the following code I get a FileNotFoundError. Any idea why this happens? I followed the usual install via conda (pytorch+cuda) and `pip install .`

```
mii_configs...
```

As the subject says: if I have to deploy one model across more than one machine, what kind of configuration can I use?

enhancement

In AML deployments the model dir is not writeable; download the config/tokenizer to a writeable cache path.
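The fix described above can be sketched as follows. This is a minimal illustration, not MII's actual implementation: the preferred path is a made-up placeholder, and the only real detail assumed is that Hugging Face libraries honour the `TRANSFORMERS_CACHE` environment variable when downloading configs and tokenizers.

```python
import os
import tempfile


def writable_cache_dir(preferred="/var/mii-cache"):
    """Pick a cache path the process can write to, falling back to a temp dir.

    The `preferred` default is a hypothetical location, not the real AML layout.
    """
    candidate = preferred
    try:
        os.makedirs(candidate, exist_ok=True)
    except OSError:
        candidate = None
    if candidate is None or not os.access(candidate, os.W_OK):
        # Read-only model dir (as in AML deployments): use a fresh temp dir instead.
        candidate = tempfile.mkdtemp(prefix="mii-cache-")
    # Hugging Face libraries consult this variable for config/tokenizer downloads.
    os.environ["TRANSFORMERS_CACHE"] = candidate
    return candidate
```

Downstream `from_pretrained(...)` calls would then cache into the returned directory instead of the read-only model dir.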

Provide a local AML deployment option; this will use the [AML inference server](https://pypi.org/project/azureml-inference-server-http/) as the front end. We can then easily deploy an MII-generated score file via: `azmlinfsrv --model_dir --entry_script...

enhancement

I'm running into a CUDA OOM error when loading this model, due to its large size and the lack of multi-GPU support in the HF pipeline.

Allow users to pass a dictionary or [transformers.PretrainedConfig](https://huggingface.co/docs/transformers/v4.19.2/en/main_classes/configuration#transformers.PretrainedConfig) when deploying models.
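One way the requested feature could look is a shallow merge of user-supplied overrides over MII's defaults. This is only a sketch: the key names below mirror common `transformers.PretrainedConfig` fields, but the exact keys MII would accept, and `merge_model_config` itself, are assumptions for illustration.

```python
# Hypothetical defaults; real MII defaults may differ.
DEFAULT_MODEL_CONFIG = {
    "use_cache": True,
    "torch_dtype": "float16",
}


def merge_model_config(user_config=None):
    """Overlay user-supplied model-config overrides on the defaults (shallow merge).

    `user_config` stands in for the dict (or PretrainedConfig converted via
    `.to_dict()`) that a user would pass at deployment time.
    """
    merged = dict(DEFAULT_MODEL_CONFIG)
    if user_config:
        merged.update(user_config)
    return merged
```

A deployment call would then forward the merged dict when constructing the model, so users can override e.g. the dtype without touching the defaults.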

enhancement

After #25 is complete, we want to expose all DS-inference configs (https://deepspeed.readthedocs.io/en/latest/inference-init.html#deepspeed.init_inference) and ZeRO-inference configs in the MII config dictionary.
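A minimal sketch of what exposing those configs might look like: a nested section of the MII config dictionary whose entries are forwarded verbatim to `deepspeed.init_inference`. The key names `mp_size`, `dtype`, and `replace_with_kernel_inject` are real `deepspeed.init_inference` arguments, but the `"ds_inference_config"` nesting and the helper below are assumptions, not MII's actual schema.

```python
# Hypothetical MII config with a nested DS-inference section.
mii_config = {
    "port_number": 50050,
    "ds_inference_config": {
        "mp_size": 1,
        "dtype": "fp16",
        "replace_with_kernel_inject": True,
    },
}


def ds_inference_kwargs(config):
    """Pull out the kwargs that would be forwarded to deepspeed.init_inference."""
    return dict(config.get("ds_inference_config", {}))
```

MII would then call `deepspeed.init_inference(model, **ds_inference_kwargs(mii_config))`, so any new DS-inference knob becomes reachable without an MII code change.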

enhancement