DeepSpeed issues

Fix wrong unit of latency in flops-profiler (#2090)

Fix issue #2090 by converting microseconds to seconds
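As a rough illustration of the kind of unit fix described above, here is a minimal sketch. It is not the flops-profiler's actual code: the function names `microseconds_to_seconds` and `format_latency` are hypothetical, and the only assumption taken from the entry is that a duration measured in microseconds has to be converted to seconds before it is reported.

```python
def microseconds_to_seconds(duration_us: float) -> float:
    """Convert a duration in microseconds to seconds (hypothetical helper)."""
    return duration_us / 1_000_000


def format_latency(duration_s: float) -> str:
    """Render a duration given in seconds with a readable unit (hypothetical helper)."""
    if duration_s >= 1:
        return f"{duration_s:.2f} s"
    if duration_s >= 1e-3:
        return f"{duration_s * 1e3:.2f} ms"
    return f"{duration_s * 1e6:.2f} us"


# A raw timer reading of 1500 microseconds must be converted first;
# passing it straight to format_latency would report it as 1500 seconds.
latency_s = microseconds_to_seconds(1500)
print(format_latency(latency_s))
```

Keeping the conversion in one named helper makes the unit explicit at the call site, which is one way to avoid reporting a value in the wrong unit.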

MoQ problem: 'str' object has no attribute 'size'

1

I got an error using MoQ in a ROCm environment on an AMD GPU. Error: Traceback (most recent call last): File "train_deepspeed.py", line 694, in main(args) File "train_deepspeed.py", line 554, in...

ImNoBadBoy

bug

Fix type checking of offload optimizer before checked class was imported

3

This bug prevents the Megatron-LM 10B offload training example from running

ollmer

[BUG] zero_to_fp32 ordering files incorrectly for combining shards

1

**Note** I found a bug and a fix. However, rather than directly submitting a fix and PR, I'm reporting the issue. If I have time I'll also submit a PR....

micklemouse

bug
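The summary above is truncated and does not say what causes the misordering. One common cause of this class of bug is sorting numbered file names lexicographically, so that rank 10 lands before rank 2. The sketch below shows a numeric-aware sort under that assumption only; the file names and the `natural_key` helper are hypothetical and are not taken from the issue or from zero_to_fp32 itself.

```python
import re


def natural_key(name: str):
    """Split a name into text and integer parts so that numbers compare numerically."""
    return [int(part) if part.isdigit() else part for part in re.split(r"(\d+)", name)]


# Hypothetical shard file names, used only for illustration.
files = ["shard_rank_10.pt", "shard_rank_2.pt", "shard_rank_1.pt"]

lexicographic = sorted(files)            # plain string order: rank 10 sorts before rank 2
natural = sorted(files, key=natural_key)  # numeric-aware order: 1, 2, 10

print(lexicographic)
print(natural)
```

If shards are concatenated in the order the files are listed, the two orderings produce different results once there are ten or more ranks, which is why the sort key matters.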

[BUG] Floating Point Exception (core dump) at launch_attn_softmax_v2<float>

3

## Describe the bug I tried to run inference on a GPT-2 model with the code below. The code uses the DeepSpeed inference optimization. When I repeated model inference continuously, `floating point exception(core dump)`...

codertimo

bug

inference

[BUG] attention_mask is overwritten by dummy tensor at DeepSpeedSelfAttentionFunction

**Describe the bug** https://github.com/microsoft/DeepSpeed/pull/1705 added a line that overwrites the input_mask (attention_mask) in DeepSpeedSelfAttentionFunction with a dummy attention mask. Because of this code, the `attention_mask` input is ignored in the forward pass of all transformer models....

codertimo

bug

inference

[BUG] model.eval() doesn't work with DeepSpeed Transformer Kernel

2

**Describe the bug** The traditional way of calling model.eval() doesn't seem to work with the DeepSpeed Transformer Kernel. The training flag is changed; however, the randomness is still there. **To Reproduce** I've made...

hezq06

bug

inference

[BUG] can not initialize DeepSpeed-Inference engine with deepspeed.init_inference()

2

Hello, I am a new user of DeepSpeed (DS) and I successfully trained checkpoints using DS. However, I ran into an issue when trying to use the checkpoint for inference. I want to...

Jirigesi

bug

inference

OPT-66B memory and GPU requirement

5

Hi, I am trying to fine-tune Meta's OPT-66B; however, our system always tells me that there is not enough memory. >Max vmem = 434.289G Max rss = 315.860G failed...

Nvpiao

inference

[REQUEST] Run DeepSpeed inference in C++

**Is your feature request related to a problem? Please describe.** What is the best way to run DeepSpeed inference in C++? **Describe the solution you'd like** Documenting if it is...

dashesy

enhancement

inference
