Chi Zhang

Results 17 issues of Chi Zhang

### Versions

- [ ] Python version:
- [ ] Python architecture:
- [ ] Operating system and version:
- [ ] OpenDSSDirect.py version number:

### Feature Request

### Bug...

Does the current PipelineStage API support variable-length input shapes, as Megatron does? https://github.com/NVIDIA/Megatron-LM/blob/e33c8f78a35765d5aa37475a144da60e8a2349d1/megatron/core/model_parallel_config.py#L212 This is particularly useful for packed inputs, where all padding is removed.

question
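To make the packed-input case in the PipelineStage question above concrete, here is a minimal sketch in plain Python. It is not the PipelineStage or Megatron API; the helper names `pack_sequences` and `unpack_sequences` are hypothetical, and `cu_seqlens` is only a name borrowed from the convention that variable-length attention kernels often use for cumulative sequence offsets. The point it illustrates is that the packed length changes from batch to batch, so a pipeline stage cannot assume a fixed input shape.

```python
from itertools import accumulate


def pack_sequences(seqs):
    """Concatenate variable-length sequences without any padding.

    Returns the flat token list and the cumulative offsets that mark
    where each original sequence starts and ends.
    """
    tokens = [tok for seq in seqs for tok in seq]
    cu_seqlens = [0, *accumulate(len(seq) for seq in seqs)]
    return tokens, cu_seqlens


def unpack_sequences(tokens, cu_seqlens):
    """Recover the original sequences from the packed representation."""
    return [tokens[start:end] for start, end in zip(cu_seqlens, cu_seqlens[1:])]


# Two micro-batches with the same number of sequences but different
# packed lengths: the shape seen by the next stage differs between them.
batch_a = [[1, 2, 3], [4], [5, 6]]
batch_b = [[7, 8], [9], [10]]

tokens_a, cu_a = pack_sequences(batch_a)
tokens_b, cu_b = pack_sequences(batch_b)
```

Here `tokens_a` has 6 elements and `tokens_b` has 4, which is why a stage that pre-allocates receive buffers from a fixed shape would need some way to learn the shape per micro-batch.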

We are currently trying to apply torchtitan to MoE models. MoE models require grouped_gemm (https://github.com/fanshiqing/grouped_gemm). GroupedGemm ops basically follow the same rules as ColumnLinear and RowLinear. Is there...

question

We should make them mutually exclusive by adding an assertion in the config.
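
A mutual-exclusion check of this kind could look roughly like the sketch below. The issue snippet does not say which options are meant, so `ExampleConfig`, `option_a`, and `option_b` are hypothetical placeholders, not names from any real project config.

```python
from dataclasses import dataclass


@dataclass
class ExampleConfig:
    # Hypothetical flags standing in for the two options that must not
    # be enabled together.
    option_a: bool = False
    option_b: bool = False

    def __post_init__(self):
        # Fail at construction time rather than deep inside training.
        assert not (self.option_a and self.option_b), (
            "option_a and option_b are mutually exclusive"
        )


# Enabling either option alone is fine.
cfg = ExampleConfig(option_a=True)
```

Placing the assertion in `__post_init__` means an invalid combination is rejected as soon as the config object is built. Note that plain `assert` statements are skipped when Python runs with `-O`; raising `ValueError` is an alternative if that matters.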