Insu Jang
I have the same issue. It is not just flops but also macs. `module.__flops__` and `module.__macs__` are calculated in the post hook in the profiler code: https://github.com/microsoft/DeepSpeed/blob/80f94c10c552ec79473775adb8902b210656ed76/deepspeed/profiling/flops_profiler/profiler.py#L91-L95 `module_flop_count[-1]` and `module_mac_count[-1]` have more...
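For context, this is a minimal sketch of the post-hook accumulation pattern I am referring to (simplified from the linked lines, not the exact DeepSpeed implementation):

```python
# Simplified sketch: patched torch ops append (op_name, count) tuples into the
# innermost bucket, and the post hook rolls them up onto the module.
module_flop_count = []  # each entry is a list of (op_name, flops) for one module call
module_mac_count = []   # each entry is a list of (op_name, macs) for one module call

def pre_hook(module, input):
    # one fresh bucket per forward call of this module
    module_flop_count.append([])
    module_mac_count.append([])

def post_hook(module, input, output):
    if module_flop_count:
        # sum everything recorded during this call and attach it to the module
        module.__flops__ += sum(elem[1] for elem in module_flop_count[-1])
        module_flop_count.pop()
        module.__macs__ += sum(elem[1] for elem in module_mac_count[-1])
        module_mac_count.pop()
```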
@TongLi3701 I am facing the same problem with transformers 4.36.0 and the colossalai branch `feature/update-transformers`, which targets transformers 4.36.0.
@wangbluo Could you please help me solve this issue? Thanks
I used the 7b configuration.
I am not sure whether this is a bug or an unavoidable error due to lower precision, and whether it was intended to be tested only in fp32. Would appreciate it if...
@Edenzzzz, thank you for taking the time to look into this issue. I am not sure this fix works. I tested with `enable_all_optimization=False`, `enable_sequence_parallelism=False`, and `enable_sequence_overlap=False`, but I still hit the same problem...
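For reference, this is roughly how I disabled those options (a sketch; `tp_size`, `pp_size`, and `precision` are placeholders for what my 7b run actually used):

```python
from colossalai.booster.plugin import HybridParallelPlugin

# Sketch of the plugin configuration used for the test above.
plugin = HybridParallelPlugin(
    tp_size=2,
    pp_size=1,
    precision="bf16",
    enable_all_optimization=False,
    enable_sequence_parallelism=False,
    enable_sequence_overlap=False,
)
```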
Looks like `preprocess` in each policy might be the reason: https://github.com/hpcaitech/ColossalAI/blob/341263df48bbef1174c41b6c4f5f6785f895b0d4/colossalai/shardformer/policies/bert.py#L39-L51 https://github.com/hpcaitech/ColossalAI/blob/341263df48bbef1174c41b6c4f5f6785f895b0d4/colossalai/shardformer/policies/gpt2.py#L32-L43 Although all policies have the same resize logic, each model has a different default vocab embedding size, so only...
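The resize logic shared by those `preprocess` methods looks roughly like this (simplified from the linked policy code):

```python
# Simplified sketch of the shared resize logic in the linked `preprocess` methods:
# pad the vocab size up to the next multiple of the tensor parallel world size
# by calling HF's resize_token_embeddings, which recreates the embedding module.
def preprocess(self):
    if self.shard_config.enable_tensor_parallelism:
        vocab_size = self.model.config.vocab_size
        world_size = self.shard_config.tensor_parallel_size
        if vocab_size % world_size != 0:
            new_vocab_size = vocab_size + world_size - vocab_size % world_size
            self.model.resize_token_embeddings(new_vocab_size)
    return self.model
```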
A quick potential patch is to not use HF's `resize_token_embeddings` but instead use `nn.functional.pad` to resize the weight tensor, avoiding recreation of the `nn.Embedding` (not sure if there are other attributes that should...
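Something along these lines (a rough sketch of the idea; `pad_vocab_embedding` is a hypothetical helper, and attribute updates beyond `num_embeddings` are exactly the open question):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pad_vocab_embedding(embedding: nn.Embedding, world_size: int) -> None:
    """Hypothetical helper: pad the embedding weight in place so the vocab size
    becomes divisible by world_size, without recreating the nn.Embedding."""
    vocab_size, _ = embedding.weight.shape
    remainder = vocab_size % world_size
    if remainder == 0:
        return
    pad_rows = world_size - remainder
    with torch.no_grad():
        # F.pad on a 2-D tensor takes (left, right, top, bottom): append zero rows.
        padded = F.pad(embedding.weight, (0, 0, 0, pad_rows))
    embedding.weight = nn.Parameter(padded)
    embedding.num_embeddings = vocab_size + pad_rows
    # NOTE: other attributes (e.g. tied lm_head weights, config.vocab_size)
    # may also need updating; that is the part I am unsure about.
```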
Maybe it is related to #5489?