DeepSpeedExamples
Overflow in deepspeed-chat LoRA and BF16 mode
- Example: Deepspeed-chat
- Model: Llama2-7b-hf
- Mode: LoRA, lora_dim=128
- Precision: FP16
- Output log below:
- Question: Does this log mean training is proceeding correctly? It differs from the logs of SFT and LoRA-only runs, which print the loss at each step. If it is not correct, how can I make LoRA mode run correctly?
Model Parameters: 6.927 B, Latency: 6.08s, TFLOPs: 1.72, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.06s, TFLOPs: 1.73, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.07s, TFLOPs: 1.72, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.07s, TFLOPs: 1.72, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.07s, TFLOPs: 1.72, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.07s, TFLOPs: 1.72, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.07s, TFLOPs: 1.72, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.07s, TFLOPs: 1.73, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
[2023-10-20 09:49:17,004] [INFO] [logging.py:96:log_dist] [Rank 0] step=40, skipped=6, lr=[9.618683345445294e-06, 0.0004983773754116733], mom=[(0.9, 0.95), (0.9, 0.95)]
[2023-10-20 09:49:17,004] [INFO] [timer.py:260:stop] epoch=0/micro_step=40/global_step=40, RunningAvgSamplesPerSec=5.187625237365746, CurrSamplesPerSec=5.008550092748559, MemAllocated=3.9GB, MaxMemAllocated=6.71GB
Model Parameters: 6.927 B, Latency: 6.39s, TFLOPs: 1.64, Samples/sec: 0.63, Time/seq 1.60s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.06s, TFLOPs: 1.73, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.07s, TFLOPs: 1.72, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.09s, TFLOPs: 1.72, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.08s, TFLOPs: 1.72, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
Model Parameters: 6.927 B, Latency: 6.07s, TFLOPs: 1.72, Samples/sec: 0.66, Time/seq 1.52s, Batch Size: 4, Sequence Length: 512
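For context, the `skipped=6` field in the step-40 log line above is DeepSpeed's counter of optimizer steps skipped because the fp16 dynamic loss scaler detected gradient overflow; the title mentions BF16 but the run above uses FP16. One common workaround is to train in bf16, which has the same exponent range as fp32 and does not need loss scaling. Below is a minimal sketch of the relevant DeepSpeed config fragment (the field names follow the public DeepSpeed config schema; whether and how the deepspeed-chat launch scripts expose this as a flag is an assumption to verify against your script):

```json
{
  "bf16": {
    "enabled": true
  },
  "fp16": {
    "enabled": false
  }
}
```

Note that bf16 requires hardware support (e.g. Ampere-class GPUs or newer); on older hardware, raising the initial loss scale or lowering the learning rate for the fp16 run would be the alternatives to try.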