DeepSpeedExamples issues

Does DeepSpeed-FastGen support Ascend NPU?

3

Does DeepSpeed-FastGen support Ascend NPU and deepseek-r1-distilled-qwen2.5-32b?

Why Does vf_loss Take the Maximum Value, Rendering Clamp Meaningless?

2

critic_loss:

    def critic_loss_fn(self, values, old_values, returns, mask):
        ## value loss
        values_clipped = torch.clamp(
            values,
            old_values - self.cliprange_value,
            old_values + self.cliprange_value,
        )
        vf_loss1 = (values - returns) ** 2
        vf_loss2 =...

Morizhaoyang
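
On the vf_loss question above: the excerpt is truncated, so what follows is only a sketch of the clipped value loss commonly used in PPO implementations, not necessarily the exact code in the repository. It uses plain Python floats instead of torch tensors, and the helper name `clipped_branch_active` and the default `cliprange_value=0.2` are illustrative assumptions. The point it tries to show is that taking the maximum does not cancel the clamp: when the clipped term wins the maximum, the loss no longer depends on `values`, so that element contributes no gradient.

```python
def clipped_value_loss(values, old_values, returns, mask, cliprange_value=0.2):
    """Masked mean of 0.5 * max(unclipped, clipped) squared error (sketch)."""
    total, count = 0.0, 0.0
    for v, v_old, r, m in zip(values, old_values, returns, mask):
        # Clamp the new value to within cliprange_value of the old value.
        v_clipped = min(max(v, v_old - cliprange_value), v_old + cliprange_value)
        loss_unclipped = (v - r) ** 2
        loss_clipped = (v_clipped - r) ** 2
        # Pessimistic choice: keep the larger of the two errors.
        total += max(loss_unclipped, loss_clipped) * m
        count += m
    return 0.5 * total / count


def clipped_branch_active(v, v_old, r, cliprange_value=0.2):
    """True when the clamp changed v AND the clipped term wins the max.

    In that case the loss is constant in v, so no gradient flows to v.
    """
    v_clipped = min(max(v, v_old - cliprange_value), v_old + cliprange_value)
    return v_clipped != v and (v_clipped - r) ** 2 >= (v - r) ** 2


# Old value 0.0, return 1.0, clip range 0.2:
print(clipped_branch_active(0.5, 0.0, 1.0))   # moved toward the return, past the range
print(clipped_branch_active(-0.5, 0.0, 1.0))  # moved away from the return, past the range
print(clipped_branch_active(0.1, 0.0, 1.0))   # still inside the range
```

Under these assumptions, only the first case selects the clipped branch: the value already moved more than `cliprange_value` toward the return, so the update stops pushing it further. In the second case the unclipped error is larger and the gradient still pulls the value back toward the return.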

KV_cache offload

3

Hi, I am using the latest Hugging Face transformers (version==4.48.0.dev0). When I tried to run the demo from [here](https://github.com/microsoft/DeepSpeedExamples/blob/master/inference/huggingface/zero_inference/README.md#example-3-llama2-models), I got this error: `AttributeError: 'LlamaForCausalLM' object has no attribute 'set_kv_cache_offload'`. Does...

yuzhenmao

Update Domino for Llama3

1

shenzheyu

[Draft] Add support for seq split in Domino

1

duanhx1037

add checkpoint

11

support checkpoint for domino

zhangsmallshark

Bump the pip group across 9 directories with 15 updates #3

akaday

Assertion `srcIndex < srcSelectDimSize` failed

1

When I try to run Stage 3 PPO finetuning for the Qwen2 0.5B model, I get the following error: ``Assertion `srcIndex < srcSelectDimSize` failed``, which seems like an issue about input...

boqiny
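
The excerpt above is truncated, so the actual cause is not stated. One commonly reported cause of this CUDA assertion is an index that exceeds the size of the indexed dimension, for example a token ID that is greater than or equal to the number of rows in the embedding table. As a minimal sketch of how such a mismatch could be checked before training (the function name `find_out_of_range_ids` and its arguments are hypothetical, and plain Python lists stand in for tensors):

```python
def find_out_of_range_ids(input_ids, num_embeddings):
    """Return (row, col, token_id) for every ID outside [0, num_embeddings)."""
    bad = []
    for row, seq in enumerate(input_ids):
        for col, tok in enumerate(seq):
            if not 0 <= tok < num_embeddings:
                bad.append((row, col, tok))
    return bad


# Hypothetical batch checked against an embedding table with 6 rows.
batch = [[1, 2, 5], [0, 7, 3]]
print(find_out_of_range_ids(batch, num_embeddings=6))
```

If a check like this reports any entries, the tokenizer vocabulary and the model's embedding size may be out of sync, which would be worth ruling out before looking elsewhere.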

enable_hybrid_engine issue

10

Error Info:

  File "/data/rooter_use/conda/envs/llama-env39/lib/python3.9/site-packages/deepspeed/runtime/hybrid_engine.py", line 398, in step
    actor_loss, critic_loss = trainer.train_rlhf(exp_data)
  File "/data/rooter_use/DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py", line 173, in train_rlhf
    actor_loss, critic_loss = trainer.train_rlhf(exp_data)
    if(self._inference_containers[0].module.attention.attn_qkvw is not None and \
  File "/data/rooter_use/DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py",...

llllooong

deespeed chat

hybrid engine

Is there any example of DeepSpeed ZeRO with Ulysses/Ulysses-offload?

I only found DeepSpeed Megatron with Ulysses/Ulysses-offload.

LSC527
