DeepSpeedExamples issues

Results 274 DeepSpeedExamples issues

Why is the PPL so high in the beginning of Step-1 (SFT)?

https://github.com/microsoft/DeepSpeedExamples/blob/737c6740bec38b77a24a59135b6481a53d566b38/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/training_log_output/opt-1.3b-globalBatchSize128.log#L4 Why is the PPL here 4k when we are starting with a pretrained model?

siddharth9820
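
For the perplexity question above, it may help to see which loss value a PPL of about 4k corresponds to. The sketch below assumes the logged PPL is the exponential of the mean per-token cross-entropy loss (natural log), which is the usual definition but is not confirmed by the issue excerpt. The function names are hypothetical and are not taken from the DeepSpeed-Chat code.

```python
import math

def perplexity(token_nlls):
    """Return exp(mean negative log-likelihood) over a list of per-token losses.

    Assumption: losses are natural-log cross-entropy values, one per token.
    """
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))

def loss_from_perplexity(ppl):
    """Invert the relation: the mean per-token loss implied by a perplexity."""
    if ppl <= 0:
        raise ValueError("perplexity must be positive")
    return math.log(ppl)

# Mean loss implied by a perplexity of 4000 (illustrative value from the question).
implied_loss = loss_from_perplexity(4000)
print(f"mean loss implied by PPL 4000: {implied_loss:.3f}")
```

Under that assumption, a PPL near 4000 implies a mean loss of roughly 8.3 nats per token. This only restates the number in loss terms; it does not explain why the initial evaluation is that high.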

No response when running deepspeed-chat

```
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
[WARNING] async_io requires the dev libaio .so object and headers but these were...
```

jli113

[Bug] In step3, a runtime error will be thrown when inference_tp_size>1

Description: In DeepSpeed-Chat step3, the runtime error "The size of tensor a (4) must match the size of tensor b (8) at non-singleton dimension 0" is thrown when inference_tp_size>1...

haolin-nju

"RuntimeError: The size of tensor a (5120) must match the size of tensor b (20480) at non-singleton dimension 0" in step3

I have successfully run step 1 and step 2 and generated the models, but encountered an error when running step 3: "RuntimeError: The size of tensor a (5120) must match...

oolongoo
